RECEPTOR OF THE THYROID/STEROID HORMONE RECEPTOR SUPERFAMILY

FIELD OF THE INVENTION

The present invention relates to novel steroid-hormone
or steroid-hormone like receptor proteins, genes
encoding such proteins, and methods of making and using
such proteins. In a particular aspect, the present
invention relates to bioassay systems for determining the
selectivity of interaction between ligands and
steroid-hormone or steroid-hormone like receptor proteins.

BACKGROUND OF THE INVENTION



Transcriptional regulation of development and
homeostasis in complex eukaryotes, including humans and
other mammals, birds, fish, insects, and the like, is
controlled by a wide variety of regulatory substances,
including steroid and thyroid hormones. These hormones
exert potent effects on development and differentiation of
phylogenetically diverse organisms. The effects of
hormones are mediated by interaction with specific, high
affinity binding proteins referred to as receptors.

The ability to identify additional compounds
which are able to affect transcription of genes which are
responsive to steroid hormones or metabolites thereof
would be of significant value in identifying compounds of
potential therapeutic use. Further, systems useful for
monitoring solutions, body fluids, and the like, for the
presence of steroid hormones or metabolites thereof would
be of value in medical diagnosis, as well as for various
biochemical applications.

A number of receptor proteins, each specific for
one of several classes of cognate steroid hormones [e.g.,
estrogens (estrogen receptor), progesterones (progesterone
receptor), glucocorticoids (glucocorticoid receptor),
androgens (androgen receptor), aldosterones
(mineralocorticoid receptor), vitamin D (vitamin D
receptor)], retinoids (e.g., retinoic acid receptor) or for
cognate thyroid hormones (e.g., thyroid hormone receptor),
are known. Receptor proteins have been found to be
distributed throughout the cell population of complex
eukaryotes in a tissue specific fashion.

Molecular cloning studies have made it possible
to demonstrate that receptors for steroid, retinoid and
thyroid hormones are all structurally related and comprise
a superfamily of regulatory proteins. These regulatory
proteins are capable of modulating specific gene expression
in response to hormone stimulation by binding directly to
cis-acting elements. Structural comparisons and functional
studies with mutant receptors have revealed that these
molecules are composed of a series of discrete functional
domains, most notably, a DNA-binding domain that is
composed typically of 66-68 amino acids, including two zinc
fingers, and an associated carboxy terminal stretch of
approximately 250 amino acids, which latter region
comprises the ligand-binding domain.

An important advance in the characterization of
this superfamily of regulatory proteins has been the
delineation of a growing list of gene products which
possess the structural features of hormone receptors. This
growing list of gene products has been isolated by
low-stringency hybridization techniques employing DNA
sequences encoding previously identified hormone receptor
proteins.

It is known that steroid or thyroid hormones,
protected forms thereof, or metabolites thereof, enter
cells and bind to the corresponding specific receptor
protein, initiating an allosteric alteration of the
protein. As a result of this alteration, the complex of
receptor and hormone (or metabolite thereof) is capable of
binding to certain specific sites on chromatin with high
affinity.

It is also known that many of the primary effects 
of steroid and thyroid hormones involve increased 
transcription of a subset of genes in specific cell types. 

A number of steroid hormone- and thyroid
hormone-responsive transcriptional control units have been
identified. These include the mouse mammary tumor virus
5'-long terminal repeat (MTV LTR), responsive to
glucocorticoid, aldosterone and androgen hormones; the
transcriptional control units for mammalian growth hormone
genes, responsive to glucocorticoids, estrogens and thyroid
hormones; the transcriptional control units for mammalian
prolactin genes and progesterone receptor genes, responsive
to estrogens; the transcriptional control units for avian
ovalbumin genes, responsive to progesterones; mammalian
metallothionein gene transcriptional control units,
responsive to glucocorticoids; and mammalian hepatic
alpha-2u-globulin gene transcriptional control units,
responsive to androgens, estrogens, thyroid hormones, and
glucocorticoids.

A major obstacle to further understanding and
more widespread use of the various members of the
steroid/thyroid superfamily of hormone receptors has been
a lack of availability of the receptor proteins, in
sufficient quantity and sufficiently pure form, to allow
them to be adequately characterized. The same is true for
the DNA gene segments which encode them. Lack of
availability of these DNA segments has prevented in vitro
manipulation and in vivo expression of the
receptor-encoding genes, and consequently the knowledge such
manipulation and expression would yield.



In addition, a further obstacle to a more
complete understanding and more widespread use of members
of the steroid/thyroid receptor superfamily is the fact
that additional members of this superfamily remain to be
discovered, isolated and characterized.

The present invention is directed to overcoming
these problems of short supply of adequately purified
receptor material and lack of DNA segments which encode such
receptors, and to increasing the number of identified and
characterized hormone receptors which are available for
use.

BRIEF DESCRIPTION OF THE INVENTION 

In accordance with the present invention, we have
discovered novel members of the steroid/thyroid superfamily
of receptors. The novel receptors of the present invention
are soluble, intracellular, nuclear (as opposed to cell
surface) receptors, which are activated to modulate
transcription of certain genes in animal cells when the
cells are exposed to ligands therefor. The nuclear
receptors of the present invention differ significantly
from known steroid receptors, both in primary sequence and
in responsiveness to exposure of cells to various ligands,
e.g., steroids or steroid-like compounds.


Also provided in accordance with the present
invention are DNAs encoding the receptors of the present
invention, including expression vectors for expression
thereof in animal cells, cells transformed with such
expression vectors, cells co-transformed with such
expression vectors and reporter vectors (to monitor the
ability of the receptors to modulate transcription when the
cells are exposed to a compound which interacts with the
receptor); and methods of using such co-transformed cells
in screening for compounds which are capable of leading to
modulation of receptor activity.

Further provided in accordance with the present 
invention are DNA and RNA probes for identifying DNAs 
encoding additional steroid receptors. 


In accordance with yet another embodiment of the
invention, there is provided a method for making the
receptors of the invention by expressing DNAs which encode
the receptors in suitable host organisms.


The novel receptors and DNAs encoding same can be
employed for a variety of purposes. For example, novel
receptors of the present invention can be included as part
of a panel of receptors which are screened to determine the
selectivity of interaction of proposed agonists or
antagonists and other receptors. Thus, a compound which is
believed to interact selectively, for example, with the
glucocorticoid receptor, should not have any substantial
effect on any other receptors, including those of the
present invention. Conversely, if such a proposed compound
does interact with one or more of the invention receptors,
then the possibility of side reactions caused by such
compound is clearly indicated.

BRIEF DESCRIPTION OF THE FIGURE

Figure 1 is a schematic diagram correlating the
relationship between the alternate spliced variants of
invention receptor XR1.

DETAILED DESCRIPTION OF THE INVENTION

In accordance with the present invention, there
are provided DNAs encoding a polypeptide characterized by
having a DNA binding domain comprising about 66 amino acids
with 9 cysteine (Cys) residues, wherein said DNA binding
domain has:

(i) less than about 70% amino acid sequence
identity with the DNA binding domain of
human retinoic acid receptor-alpha (hRAR-alpha);

(ii) less than about 60% amino acid sequence
identity with the DNA binding domain of
human thyroid receptor-beta (hTR-beta);

(iii) less than about 50% amino acid sequence
identity with the DNA binding domain of
human glucocorticoid receptor (hGR); and

(iv) less than about 65% amino acid sequence
identity with the DNA binding domain of
human retinoid X receptor-alpha (hRXR-alpha).
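
The identity limits in items (i) to (iv) can be illustrated with a short Python sketch. It is a minimal illustration only, not a method described in this specification: it assumes the two DNA binding domains have already been aligned (gaps written as "-"), it counts identity over ungapped columns, and it reads "less than about" as "less than or equal to", which is one possible interpretation of an approximate limit. The function names are hypothetical.

```python
# Upper limits on DNA-binding-domain identity, from items (i)-(iv) above.
DBD_LIMITS = {
    "hRAR-alpha": 70.0,
    "hTR-beta": 60.0,
    "hGR": 50.0,
    "hRXR-alpha": 65.0,
}


def percent_identity(aligned_a: str, aligned_b: str) -> float:
    """Percent identity over aligned columns in which neither sequence has a gap."""
    if len(aligned_a) != len(aligned_b):
        raise ValueError("aligned sequences must have equal length")
    compared = 0
    matches = 0
    for a, b in zip(aligned_a.upper(), aligned_b.upper()):
        if a == "-" or b == "-":
            continue  # skip gapped columns
        compared += 1
        if a == b:
            matches += 1
    if compared == 0:
        raise ValueError("no ungapped columns to compare")
    return 100.0 * matches / compared


def meets_dbd_criteria(identities: dict) -> bool:
    """True if every identity is at or below its limit (assumed reading of 'about')."""
    return all(identities[name] <= limit for name, limit in DBD_LIMITS.items())
```

For example, the approximate values given elsewhere in this description for XR1 (68%, 59%, 45% and 65%) satisfy all four limits under this reading, whereas a domain identical to that of hGR would not.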

Alternatively, DNAs of the invention can be
characterized with respect to percent amino acid sequence
identity of the ligand binding domain of polypeptides
encoded thereby, relative to amino acid sequences of
previously characterized receptors. As yet another
alternative, DNAs of the invention can be characterized by
the percent overall amino acid sequence identity of
polypeptides encoded thereby, relative to amino acid
sequences of previously characterized receptors.

Thus, DNAs of the invention can be characterized
as encoding polypeptides having, in the ligand binding
domain:

(i) less than about 35% amino acid sequence
identity with the ligand binding domain
of hRAR-alpha;

(ii) less than about 30% amino acid sequence
identity with the ligand binding domain
of hTR-beta;

(iii) less than about 25% amino acid sequence
identity with the ligand binding domain
of hGR; and

(iv) less than about 30% amino acid sequence
identity with the ligand binding domain
of hRXR-alpha.

DNAs of the invention can be further
characterized as encoding polypeptides having an overall
amino acid sequence identity of:

(i) less than about 35% relative to hRAR-alpha;

(ii) less than about 35% relative to hTR-beta;

(iii) less than about 25% relative to hGR; and

(iv) less than about 35% relative to hRXR-alpha.

Specific receptors contemplated for use in the
practice of the present invention include:

"XR1" (variously referred to herein as receptor
"XR1", "hXR1", "hXR1.pep" or "verHT19.pep";
wherein the prefix "h" indicates the clone is of
human origin), a polypeptide characterized as
having a DNA binding domain comprising:

(i) about 68% amino acid sequence identity
with the DNA binding domain of
hRAR-alpha;



PCT/US92/07570




(ii) about 59% amino acid sequence identity
with the DNA binding domain of
hTR-beta;

(iii) about 45% amino acid sequence identity
with the DNA binding domain of hGR; and

(iv) about 65% amino acid sequence identity
with the DNA binding domain of
hRXR-alpha;

see also Sequence ID No. 2 for a specific amino
acid sequence representative of XR1, as well as
Sequence ID No. 1 which is an exemplary
nucleotide sequence encoding XR1. In addition,
Sequence ID Nos. 4 and 6 present alternate
amino-terminal sequences for the clone referred to as
XR1 (the variant referred to as verht3 is
presented in Sequence ID No. 4 (an exemplary
nucleotide sequence encoding such variant is
presented in Sequence ID No. 3), and the variant
referred to as verhr5 is presented in Sequence ID
No. 6 (an exemplary nucleotide sequence encoding
such variant is presented in Sequence ID No. 5));

"XR2" (variously referred to herein as receptor
"XR2", "hXR2" or "hXR2.pep"), a polypeptide
characterized as having a DNA binding domain
comprising:

(i) about 55% amino acid sequence identity
with the DNA binding domain of
hRAR-alpha;

(ii) about 56% amino acid sequence identity
with the DNA binding domain of
hTR-beta;

(iii) about 50% amino acid sequence identity
with the DNA binding domain of hGR; and

(iv) about 52% amino acid sequence identity
with the DNA binding domain of
hRXR-alpha;

see also Sequence ID No. 8 for a specific amino
acid sequence representative of XR2, as well as
Sequence ID No. 7 which is an exemplary
nucleotide sequence encoding XR2;

"XR4" (variously referred to herein as receptor
"XR4", "mXR4" or "mXR4.pep"; wherein the prefix
"m" indicates the clone is of mouse origin), a
polypeptide characterized as having a DNA binding
domain comprising:

(i) about 62% amino acid sequence identity
with the DNA binding domain of
hRAR-alpha;

(ii) about 58% amino acid sequence identity
with the DNA binding domain of
hTR-beta;

(iii) about 48% amino acid sequence identity
with the DNA binding domain of hGR; and

(iv) about 62% amino acid sequence identity
with the DNA binding domain of
hRXR-alpha;

see also Sequence ID No. 10 for a specific amino
acid sequence representative of XR4, as well as
Sequence ID No. 9 which is an exemplary
nucleotide sequence encoding XR4;



"XR5" (variously referred to herein as receptor
"XR5", "mXR5" or "mXR5.pep"), a polypeptide
characterized as having a DNA binding domain
comprising:

(i) about 59% amino acid sequence identity
with the DNA binding domain of
hRAR-alpha;

(ii) about 52% amino acid sequence identity
with the DNA binding domain of
hTR-beta;

(iii) about 44% amino acid sequence identity
with the DNA binding domain of hGR; and

(iv) about 61% amino acid sequence identity
with the DNA binding domain of
hRXR-alpha;

see also Sequence ID No. 12 for a specific amino
acid sequence representative of XR5, as well as
Sequence ID No. 11 which is an exemplary
nucleotide sequence encoding XR5; and

"XR79" (variously referred to herein as "XR79",
"dXR79" or "dXR79.pep"; wherein the prefix "d"
indicates the clone is of Drosophila origin), a
polypeptide characterized as having a DNA binding
domain comprising:

(i) about 59% amino acid sequence identity
with the DNA binding domain of
hRAR-alpha;

(ii) about 55% amino acid sequence identity
with the DNA binding domain of
hTR-beta;

(iii) about 50% amino acid sequence identity
with the DNA binding domain of hGR; and

(iv) about 65% amino acid sequence identity
with the DNA binding domain of
hRXR-alpha;

see also Sequence ID No. 14 for a specific amino
acid sequence representative of XR79, as well as
Sequence ID No. 13 which is an exemplary
nucleotide sequence encoding XR79.




The receptor referred to herein as "XR1" is
observed as three closely related proteins, presumably
produced by alternate splicing from a single gene. The
first of these proteins to be characterized (referred to as
"verht19") comprises about 548 amino acids, and has a Mr of
about 63 kilodalton. Northern analysis indicates that a
single mRNA species corresponding to XR1 is highly
expressed in the brain. A variant of verht19
(alternatively referred to as "verht3", XR1' or XR1prim)
is further characterized as comprising about 556 amino
acids, and having a Mr of about 64 kilodalton. Yet another
variant of verht19 (alternatively referred to as "verhr5",
XR1" or XR1prim2) is further characterized as comprising
about 523 amino acids, and having a Mr of about 60
kilodalton. The interrelationship between these three
variants of XR1 is illustrated schematically in Figure 1.

The receptor referred to herein as "XR2" is
further characterized as a protein comprising about 440
amino acids, and having a Mr of about 50 kilodalton.
Northern analysis indicates that a single mRNA species
(~1.7 kb) corresponding to XR2 is expressed most highly in
liver, kidney, lung, intestine and adrenals of adult male
rats. Transactivation studies (employing chimeric
receptors containing the XR2 DNA binding domain and the
ligand binding domain of a prior art receptor) indicate
that XR2 is capable of binding to TREpal. In terms of amino
acid sequence identity with prior art receptors, XR2 is
most closely related to the vitamin D receptor (39% overall
amino acid sequence identity, 17% amino acid identity in
the amino terminal domain of the receptor, 53% amino acid
identity in the DNA binding domain of the receptor and 37%
amino acid identity in the ligand binding domain of the
receptor).

The receptor referred to herein as "XR4" is
further characterized as a protein comprising about 439
amino acids, and having a Mr of about 50 kilodalton. In
terms of amino acid sequence identity with prior art
receptors, XR4 is most closely related to the peroxisome
proliferator-activated receptor ([...]% overall amino acid
sequence identity, 30% amino acid identity in the amino
terminal domain of the receptor, 86% amino acid identity in
the DNA binding domain of the receptor and 64% amino acid
identity in the ligand binding domain of the receptor).
XR4 is expressed ubiquitously and throughout development
(as determined by in situ hybridization).



The receptor referred to herein as "XR5" is
further characterized as a protein comprising about 556
amino acids, and having a Mr of about 64 kilodalton. In
situ hybridization reveals widespread expression throughout
development. High levels of expression are observed in the
embryonic liver around day 12, indicating a potential role
in haematopoiesis. High levels are also found in maturing
dorsal root ganglia and in the skin. In terms of amino
acid sequence identity with prior art receptors, XR5 is
most closely related to the rat nerve growth factor induced
protein-B (NGFI-B) receptor. With respect to NGFI-B, XR5
has 29% overall amino acid sequence identity, 15% amino
acid identity in the amino terminal domain of the receptor,
52% amino acid identity in the DNA binding domain of the
receptor and 29% amino acid identity in the ligand binding
domain of the receptor.

The receptor referred to herein as "XR79" is
further characterized as a protein comprising about 601
amino acids, and having a Mr of about 66 kilodalton. Whole
mount in situ hybridization reveals a fairly uniform
pattern of RNA expression during embryogenesis. Northern
blot analysis indicates that a 2.5 kb transcript
corresponding to XR79 is present in RNA throughout
development. The levels of XR79 mRNA are highest in RNA
from 0-3 hour old embryos, i.e., maternal product, and
lowest in RNA from the second instar larvae (L2 stage). In
situ hybridization reveals that XR79 is distributed
relatively uniformly [...].

In terms of amino acid sequence identity with prior art
receptors, XR79 is most closely related to the mammalian
receptor TR2 [see Chang and Kokontis in Biochemical and
Biophysical Research Communications 155: 971-977 (1988)],
as well as members of the coup family, i.e., ear2,
coup (ear3), harp-1. With respect to TR2, XR79 has 33%
overall amino acid sequence identity, 16% amino acid
identity in the amino terminal domain of the receptor, 74%
amino acid identity in the DNA binding domain of the
receptor and 28% amino acid identity in the ligand binding
domain of the receptor. With respect to coup (ear3) [see
Miyajima et al., in Nucl Acids Res 16: 11057-11074 (1988)],
XR79 has 32% overall amino acid sequence identity, 21%
amino acid identity in the amino terminal domain of the
receptor, 62% amino acid identity in the DNA binding domain
of the receptor and 22% amino acid identity in the ligand
binding domain of the receptor.
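
As a consistency check on the sizes quoted above, the following Python sketch computes the average mass per residue implied by each reported pair of residue count and Mr. The figures are taken from this description; the comparison value of roughly 110 Da per residue is a general rule of thumb for proteins and is an assumption, not something stated in this specification.

```python
# (approximate residue count, approximate Mr in kilodalton), as quoted in the text.
REPORTED = {
    "XR1 (verht19)": (548, 63),
    "XR1 (verht3)": (556, 64),
    "XR1 (verhr5)": (523, 60),
    "XR2": (440, 50),
    "XR4": (439, 50),
    "XR5": (556, 64),
    "XR79": (601, 66),
}


def daltons_per_residue(residues: int, kilodaltons: float) -> float:
    """Average residue mass implied by a residue count and a molecular mass."""
    return kilodaltons * 1000.0 / residues


implied = {
    name: round(daltons_per_residue(count, mr), 1)
    for name, (count, mr) in REPORTED.items()
}

for name, value in implied.items():
    print(f"{name}: about {value} Da per residue")
```

All seven values fall between about 110 and 115 Da per residue, so the quoted lengths and masses are mutually consistent to within the stated "about".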

In accordance with a specific embodiment of the
present invention, there is provided an expression vector
which comprises DNA as previously described (or functional
fragments thereof), and which further comprises:

at the 5'-end of said DNA, a promoter and a
nucleotide triplet encoding a translational start codon,
and

at the 3'-end of said DNA, a nucleotide
triplet encoding a translational stop codon;

wherein said expression vector is operative in a
cell in culture (e.g., yeast, bacteria, mammalian) to
express the protein encoded by said DNA.
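
The requirement for a start codon at the 5'-end and a stop codon at the 3'-end of the coding DNA can be checked with a sketch such as the following. It is only an illustration under stated assumptions: it uses the standard genetic code (ATG as start; TAA, TAG and TGA as stops), it considers the coding strand alone, and the function name and example sequences are hypothetical.

```python
STOP_CODONS = {"TAA", "TAG", "TGA"}  # standard genetic code


def has_framed_start_and_stop(coding_dna: str) -> bool:
    """True if the sequence starts with ATG, ends with an in-frame stop codon,
    and contains no earlier in-frame stop codon."""
    seq = coding_dna.upper()
    if len(seq) < 6 or len(seq) % 3 != 0:
        return False
    codons = [seq[i:i + 3] for i in range(0, len(seq), 3)]
    if codons[0] != "ATG" or codons[-1] not in STOP_CODONS:
        return False
    # A stop codon before the last triplet would truncate the product.
    return not any(codon in STOP_CODONS for codon in codons[1:-1])


print(has_framed_start_and_stop("ATGGCCTAA"))     # start, one codon, stop
print(has_framed_start_and_stop("ATGTAAGCCTGA"))  # premature in-frame stop
```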




As employed herein, reference to "functional
fragments" embraces DNA encoding portions of the invention
receptors which retain one or more of the functional
characteristics of steroid hormone or steroid hormone-like
receptors, e.g., DNA binding properties of such receptors,
ligand binding properties of such receptors, the ability to
heterodimerize, nuclear localization properties of such
receptors, phosphorylation properties of such receptors,
transactivation domains characteristic of such receptors,
and the like.

In accordance with a further embodiment of the
present invention, there are provided cells in culture
(e.g., yeast, bacteria, mammalian) which are transformed
with the above-described expression vector.

In accordance with yet another embodiment of the 
present invention, there is provided a method of making the 
above-described novel receptors (or functional fragments 
thereof) by culturing the above-described cells under 
conditions suitable for expression of polypeptide product. 

In accordance with a further embodiment of the 
present invention, there are provided novel polypeptide 
products produced by the above-described method. 

In accordance with a still further embodiment of
the present invention, there are provided chimeric
receptors comprising at least an amino-terminal domain, a
DNA-binding domain, and a ligand-binding domain,

wherein at least one of the domains thereof
is derived from the novel polypeptides of the
present invention; and

wherein at least one of the domains thereof
is derived from at least one previously
identified member of the steroid/thyroid
superfamily of receptors, e.g., glucocorticoid
receptor (GR), thyroid receptors (TR), retinoic
acid receptors (RAR), mineralocorticoid receptor
(MR), estrogen receptor (ER), the estrogen
related receptors (e.g., hERR1 or hERR2),
retinoid X receptors (e.g., RXRα, RXRβ or RXRγ),
vitamin D receptor (VDR), aldosterone receptor
(AR), progesterone receptor (PR), the
ultraspiracle receptor (USP), nerve growth factor
induced protein-B (NGFI-B), the coup family of
transcription factors (COUP), peroxisome
proliferator-activated receptor (PPAR), mammalian
receptor TR2 (TR2), and the like.

In accordance with yet another embodiment of the
present invention, there is provided a method of using
polypeptides of the invention to screen for response
elements and/or ligands for the novel receptors described
herein. The method to identify compounds which act as
ligands for receptor polypeptides of the invention
comprises:

assaying for the presence or absence of reporter
protein upon contacting of cells containing a chimeric form
of said receptor polypeptide and reporter vector with said
compound;

wherein said chimeric form of said receptor
polypeptide comprises the ligand binding domain of said
receptor polypeptide and the amino-terminal and DNA-binding
domains of one or more previously identified members of the
steroid/thyroid superfamily of receptors;

wherein said reporter vector comprises:

(a) a promoter that is operable in said
cell,

(b) a hormone response element which is
responsive to the receptor from which
the DNA-binding domain of said chimeric
form of said receptor polypeptide is
derived, and

(c) a DNA segment encoding a reporter
protein,

wherein said reporter protein-encoding
DNA segment is operatively
linked to said promoter for
transcription of said DNA segment, and

wherein said hormone response
element is operatively linked to said
promoter for activation thereof, and
thereafter

identifying those compounds which induce or block
the production of reporter in the presence of said chimeric
form of said receptor polypeptide.
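
The final identifying step can be pictured as a comparison of reporter signal with and without the test compound. The Python sketch below is a hedged illustration, not the assay readout used in this specification: the two-fold cut-off, the vehicle-only control, and the function name are all assumptions introduced for the example.

```python
def classify_compound(reporter_with_compound: float,
                      reporter_vehicle_only: float,
                      fold_threshold: float = 2.0) -> str:
    """Label a compound by its effect on reporter signal relative to a control.

    fold_threshold is an assumed, arbitrary cut-off; a real assay would set it
    from replicate variability.
    """
    if reporter_vehicle_only <= 0 or reporter_with_compound < 0:
        raise ValueError("reporter readings must be positive")
    if fold_threshold <= 1:
        raise ValueError("fold_threshold must be greater than 1")
    ratio = reporter_with_compound / reporter_vehicle_only
    if ratio >= fold_threshold:
        return "induces"
    if ratio <= 1.0 / fold_threshold:
        return "blocks"
    return "no clear effect"


# Hypothetical readings in arbitrary units.
print(classify_compound(500.0, 100.0))
print(classify_compound(40.0, 100.0))
print(classify_compound(120.0, 100.0))
```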

The method to identify response elements for
receptor polypeptides of the invention comprises:

assaying for the presence or absence of reporter
protein upon contacting of cells containing a chimeric form
of said receptor polypeptide and reporter vector with a
compound which is a known agonist or antagonist for the
receptor from which the ligand-binding domain of said
chimeric form of said receptor polypeptide is derived;

wherein said chimeric form of said receptor
polypeptide comprises the DNA-binding domain of the
receptor polypeptide and the amino-terminal and
ligand-binding domains of one or more previously identified
members of the steroid/thyroid superfamily of receptors;

wherein said reporter vector comprises:

(a) a promoter that is operable in said
cell,

(b) a putative hormone response element,
and

(c) a DNA segment encoding a reporter
protein,

wherein said reporter protein-encoding
DNA segment is operatively
linked to said promoter for
transcription of said DNA segment, and

wherein said hormone response
element is operatively linked to said
promoter for activation thereof; and

identifying those response elements for which the
production of reporter is induced or blocked in the
presence of said chimeric form of said receptor
polypeptide.


In accordance with yet another embodiment of the
present invention, there is provided a DNA or RNA labeled
for detection; wherein said DNA or RNA comprises a nucleic
acid segment, preferably of at least 20 bases in length,
wherein said segment has substantially the same sequence as
a segment of the same length selected from the DNA segment
represented by bases 21 - 1902, inclusive, of Sequence ID
No. 1, bases 1 - 386, inclusive, of Sequence ID No. 3,
bases 10 - 300, inclusive, of Sequence ID No. 5, bases
21 - 1615, inclusive, of Sequence ID No. 7, bases
21 - 2000, inclusive, of Sequence ID No. 9, bases 1 - 2450,
inclusive, of Sequence ID No. 11, bases 21 - 2295,
inclusive, of Sequence ID No. 13, or the complement of any
of said segments.
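
The way candidate probe segments follow from a base range can be sketched in Python as below. This is an illustration only: the base coordinates are treated as 1-based and inclusive as in the text, the complement is written 5' to 3' (that is, as the reverse complement), the helper names are hypothetical, and the placeholder sequence is not any of the Sequence ID entries.

```python
COMPLEMENT = str.maketrans("ACGTacgt", "TGCAtgca")


def reverse_complement(seq: str) -> str:
    """Complementary strand, written 5' to 3'."""
    return seq.translate(COMPLEMENT)[::-1]


def probe_segments(sequence: str, first_base: int, last_base: int, length: int = 20):
    """Yield every segment of `length` bases lying within the 1-based,
    inclusive range first_base..last_base of `sequence`."""
    if not (1 <= first_base <= last_base <= len(sequence)):
        raise ValueError("base range lies outside the sequence")
    region = sequence[first_base - 1:last_base]
    for start in range(len(region) - length + 1):
        yield region[start:start + length]


# Placeholder 2000-base sequence standing in for a real cDNA.
placeholder = "ACGT" * 500
candidates = list(probe_segments(placeholder, 21, 1902, 20))
print(len(candidates))                    # number of 20-base windows in the range
print(reverse_complement(candidates[0]))  # complement of the first window
```

A range of bases 21 to 1902 spans 1882 bases, so it contains 1863 distinct 20-base windows, each of which (or its complement) is a candidate under the wording above.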

In accordance with still another embodiment of
the present invention, there are provided methods of
testing compound(s) for the ability to regulate
transcription-activating effects of a receptor polypeptide,
said method comprising assaying for the presence or absence
of reporter protein upon contacting of cells containing a
receptor polypeptide and reporter vector with said
compound;

wherein said receptor polypeptide is
characterized by having a DNA binding domain comprising
about 66 amino acids with 9 Cys residues, wherein said DNA
binding domain has:

(i) less than about 70% amino acid sequence
identity with the DNA binding domain of
hRAR-alpha;

(ii) less than about 60% amino acid sequence
identity with the DNA binding domain of
hTR-beta;

(iii) less than about 50% amino acid sequence
identity with the DNA binding domain of hGR;
and

(iv) less than about 65% amino acid sequence
identity with the DNA binding domain of
hRXR-alpha; and

wherein said reporter vector comprises:

(a) a promoter that is operable in said cell,

(b) a hormone response element, and

(c) a DNA segment encoding a reporter protein,

wherein said reporter protein-encoding DNA segment is
operatively linked to said promoter for transcription of
said DNA segment, and

wherein said hormone response element is operatively
linked to said promoter for activation thereof.

In accordance with a still further embodiment of
the present invention, there is provided a method of
testing a compound for its ability to selectively regulate
the transcription-activating effects of a specific receptor
polypeptide, said method comprising:

assaying for the presence or absence of reporter
protein upon contacting of cells containing said receptor
polypeptide and reporter vector with said compound;

wherein said receptor polypeptide is
characterized by being responsive to the presence of a
known ligand for said receptor to regulate the
transcription of associated gene(s);



WO 93/06215



wherein said reporter vector comprises:

(a) a promoter that is operable in said
cell,

(b) a hormone response element, and

(c) a DNA segment encoding a reporter
protein,

wherein said reporter protein-encoding
DNA segment is operatively
linked to said promoter for
transcription of said DNA segment, and

wherein said hormone response
element is operatively linked to said
promoter for activation thereof; and

assaying for the presence or absence of reporter
protein upon contacting of cells containing chimeric
receptor polypeptide and reporter vector with said
compound;

wherein said chimeric receptor polypeptide
comprises the ligand binding domain of a novel
receptor of the present invention, and the DNA
binding domain of said specific receptor; and
thereafter

selecting those compounds which induce or block
the production of reporter in the presence of said specific
receptor, but are substantially unable to induce or block
the production of reporter in the presence of said chimeric
receptor.
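
The selection step compares two assays per compound: one with the specific receptor and one with the chimeric receptor. A minimal Python sketch of that comparison follows. The two-fold cut-off, the vehicle-only controls, the dictionary layout and all compound names and readings are assumptions made for illustration and do not come from this specification.

```python
def responds(treated: float, vehicle: float, fold_threshold: float = 2.0) -> bool:
    """True if the reporter signal is induced or blocked by at least fold_threshold."""
    ratio = treated / vehicle
    return ratio >= fold_threshold or ratio <= 1.0 / fold_threshold


def select_selective(readings: dict, fold_threshold: float = 2.0) -> list:
    """Return compounds that affect the specific receptor assay but not the
    chimeric receptor assay."""
    selected = []
    for name, r in readings.items():
        hits_specific = responds(r["specific"], r["specific_vehicle"], fold_threshold)
        hits_chimeric = responds(r["chimeric"], r["chimeric_vehicle"], fold_threshold)
        if hits_specific and not hits_chimeric:
            selected.append(name)
    return selected


# Hypothetical reporter readings in arbitrary units.
example = {
    "compound A": {"specific": 600, "specific_vehicle": 100,
                   "chimeric": 110, "chimeric_vehicle": 100},
    "compound B": {"specific": 550, "specific_vehicle": 100,
                   "chimeric": 480, "chimeric_vehicle": 100},
    "compound C": {"specific": 105, "specific_vehicle": 100,
                   "chimeric": 95, "chimeric_vehicle": 100},
}
print(select_selective(example))
```

In this made-up data set only compound A is retained: compound B also acts through the chimeric receptor, and compound C has no clear effect on either.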

The above-described methods of testing compounds
for the ability to regulate transcription-activating
effects of invention receptor polypeptides can be carried
out employing methods described in USSN 108,471, filed
October 20, 1987, the entire contents of which are hereby
incorporated by reference herein.

As employed herein, the term "expression vector"
refers to constructs containing DNA of the invention (or
functional fragments thereof), plus all sequences necessary
for manipulation and expression of such DNA. Such an
expression vector will contain both a "translational start
site" and a "translational stop site". Those of skill in
the art can readily identify sequences which act as either
translational start sites or translational stop sites.

Suitable host cells for use in the practice of
the present invention include prokaryotic and eukaryotic
cells, e.g., bacteria, yeast, mammalian cells and the like.

Labeled DNA or RNA contemplated for use in the
practice of the present invention comprises nucleic acid
sequences covalently attached to readily analyzable species
such as, for example, radiolabel (e.g., 32P, 3H, 35S, and the
like), enzymatically active label, and the like.

The invention will now be described in greater
detail by reference to the following non-limiting examples.

EXAMPLES

EXAMPLE I

ISOLATION AND CHARACTERIZATION OF XR1 

The KpnI/SacI restriction fragment (503bp)
including the DNA-binding domain of hRAR-alpha-encoding DNA
[See Giguere et al., Nature 330: 624-629 (1987); and
commonly assigned United States Patent Application Serial
No. 276,536, filed November 30, 1988; and European Patent
Application Publication No. 0 325 849, all incorporated
herein by reference] was nick-translated and used to screen
a rat brain cDNA library [see DNA Cloning, A practical
approach, Vol I and II, D. M. Glover, ed. (IRL Press
(1985))] and a lambda-gt11 human liver cDNA library [Kwok et
al., Biochem. 24: 556 (1985)] at low stringency. The
hybridization mixture contained 35% formamide, 1X
Denhardt's, 5X SSPE (1X SSPE=0.15 M NaCl, 10mM Na2HPO4, 1mM
EDTA), 0.1% SDS, 10% dextran sulfate, 100 µg/ml denatured
salmon sperm DNA and 10^6 cpm of [32P]-labelled probe.
Duplicate nitrocellulose filters were hybridized for 16h at
42°C, washed once at 25°C for 15 min with 2X SSC (1X
SSC=0.15 M NaCl, 0.015 M sodium citrate), 0.1% SDS and then
washed twice at 55°C for 30 min. in 2X SSC, 0.1% SDS. The
filters were autoradiographed for 3 days at -70°C using an
intensifying screen.
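
The buffer strengths quoted above (5X SSPE, 2X SSC) follow from the stated 1X recipes by simple scaling. The sketch below works this out in millimolar units; the dictionary and function names are hypothetical, and the phosphate component is labeled generically because the salt formula is only partly legible in the source.

```python
# 1X recipes as stated in the text, expressed in millimolar (mM).
SSPE_1X = {"NaCl": 150.0, "sodium phosphate": 10.0, "EDTA": 1.0}
SSC_1X = {"NaCl": 150.0, "sodium citrate": 15.0}


def at_strength(recipe_1x, fold):
    """Scale a 1X recipe to an N-fold working strength (e.g. fold=5 for 5X)."""
    return {component: conc * fold for component, conc in recipe_1x.items()}


if __name__ == "__main__":
    for name, recipe, fold in (("SSPE", SSPE_1X, 5), ("SSC", SSC_1X, 2)):
        print(f"{fold}X {name}:")
        for component, conc in at_strength(recipe, fold).items():
            print(f"  {component}: {conc:g} mM")
```

Under these recipes the 5X SSPE in the hybridization mixture corresponds to 0.75 M NaCl, and the 2X SSC wash to 0.3 M NaCl with 0.03 M sodium citrate.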

After several rounds of screening, a pure
positive clone having an insert of about 2.1 kb is obtained
from the rat brain cDNA library. Several positive clones
are obtained from the human liver library. Sequence
analysis of the positive rat brain clone indicates that
this clone encodes a novel member of the steroid/thyroid
superfamily of receptors. Sequence analysis of one of the
positive human liver clones (designated "hL1", a 1.7 kb
cDNA) indicates that this clone is the human equivalent of
the rat brain clone, based on sequence homology.

The EcoRI insert of clone hL1 (labeled with 32P)
is also used as a probe to screen a human testis cDNA
library (Clontech) and a human retina cDNA library [see
Nathans et al., in Science 232: 193-202 (1986)].
Hybridization conditions comprised a hybridization mixture
containing 50% formamide, 1X Denhardt's, 5X SSPE, 0.1% SDS,
100 µg/ml denatured salmon sperm DNA and 10^6 cpm of [32P]-
labelled probe. Duplicate nitrocellulose filters were
hybridized for 16h at 42°C, washed once at 25°C for 15 min
with 2X SSC (1X SSC=0.15 M NaCl, 0.015 M sodium citrate),
0.1% SDS and then washed twice at 55°C for 30 min. in 2X
SSC, 0.1% SDS. The filters were autoradiographed for 3
days at -70°C using an intensifying screen.

After several rounds of screening, five (5)
positive clones were obtained from the human retina cDNA
library, and five (5) positive clones were obtained from
the human testis cDNA library. Sequence analysis of two
clones from the testis library indicates that these clones
encode different isoforms of the same novel member of the
steroid/thyroid superfamily of receptors (designated as
"Verht19" and "Verht3"). Sequence analysis of one of the
positive clones from the human retina library indicates
that this clone is yet another isoform of the same novel
member of the steroid/thyroid superfamily of receptors
(designated "Verhr5"). The full length sequence of Verht19
is set forth herein as Sequence ID No. 1 (which includes an
indication of where the splice site is for each of the
variants, verht3 and verhr5). The amino-terminal sequences
of verht3 and verhr5 are presented in Sequence ID Nos. 3
and 5, respectively. In addition, the interrelationship
between each of these three isoforms is illustrated
schematically in Figure 1.

EXAMPLE II

ISOLATION AND CHARACTERIZATION OF XR2

The KpnI/SacI restriction fragment (503bp)
including the DNA-binding domain of hRAR-alpha-encoding DNA
[See Giguere et al., Nature 330: 624 (1987); and commonly
assigned United States Patent Application Serial No.
276,536, filed November 30, 1988; and European Patent
Application Publication No. 0 325 849, all incorporated
herein by reference] was nick-translated and used to screen
a lambda-gt11 human liver cDNA library [Kwok et
al., Biochem. 24: 556 (1985)] at low stringency. The
hybridization mixture contained 35% formamide, 1X
Denhardt's, 5X SSPE (1X SSPE=0.15 M NaCl, 10mM Na2HPO4, 1mM
EDTA), 0.1% SDS, 10% dextran sulfate, 100 µg/ml denatured
salmon sperm DNA and 10^6 cpm of [32P]-labelled probe.
Duplicate nitrocellulose filters were hybridized for 16h at
42°C, washed once at 25°C for 15 min with 2X SSC (1X
SSC=0.15 M NaCl, 0.015 M sodium citrate), 0.1% SDS and then
washed twice at 55°C for 30 min. in 2X SSC, 0.1% SDS. The
filters were autoradiographed for 3 days at -70°C using an
intensifying screen.

Positive clones were isolated, subcloned into
pGEM vectors (Promega, Madison, Wisconsin, USA),
restriction mapped, and re-subcloned in various sized
restriction fragments into M13mp18 and M13mp19 sequencing
vectors. DNA sequence was determined by the dideoxy method
with Sequenase™ sequencing kit (United States Biochemical,
Cleveland, Ohio, USA) and analyzed by University of
Wisconsin Genetics Computer Group programs [Devereux
et al., Nucl. Acids Res. 12, 387 (1984)]. Several clones
of a unique receptor-like sequence were identified, the
longest of which was designated lambda-HL1-1 (also referred
to herein as XR2).

The DNA sequence of the resulting clone is set
forth as Sequence ID No. 7.

EXAMPLE III
ISOLATION AND CHARACTERIZATION OF XR4

A clone which encodes a portion of the coding
sequence for XR4 was isolated from a mouse embryonic
library by screening under low stringency conditions (as
described above).

The library used was a lambda gt10 day 8.5 cDNA
library having an approximate titer of 1.3 x 10^10/ml
(derived from 8.5 day old embryonic material with as much
of the amnion and extraembryonic tissues dissected away as
possible). This library was prepared from poly A+ selected
RNA (by oligo-dT priming), Gubler & Hoffman cloning methods
[Gene 25: 263 (1983)], and cloned into the EcoRI site of
lambda gt10.

The probe used was a mixture of radioactively
labeled DNA derived from the DNA binding regions of the
human alpha and beta retinoic acid receptors.

Positive clones were isolated, subcloned into
pGEM vectors (Promega, Madison, Wisconsin, USA),
restriction mapped, and re-subcloned in various sized
restriction fragments into M13mp18 and M13mp19 sequencing
vectors. DNA sequence was determined by the dideoxy method
with Sequenase™ sequencing kit (United States Biochemical,
Cleveland, Ohio, USA) and analyzed by University of
Wisconsin Genetics Computer Group programs [Devereux
et al., Nucl. Acids Res. 12, 387 (1984)]. Several clones
of a unique receptor-like sequence were identified, the
longest of which was designated XR4.

The DNA sequence of the resulting clone is set
forth as Sequence ID No. 9.

EXAMPLE IV
ISOLATION AND CHARACTERIZATION OF XR5

A clone which encodes a portion of the coding
sequence for XR5 was isolated from a mouse embryonic
library by screening under low stringency conditions (as
described above).

The library used was the same lambda gt10 day 8.5
cDNA library described in the preceding example.
Similarly, the probe used was the same mixture of
radioactively labeled DNA described in the preceding
example.

Only one of the clones isolated corresponds to a
portion of the coding region for XR5. A 0.7 kb EcoRI
fragment of this clone (designated as No. 11-17) was
subcloned into the Bluescript pKSII vector. Partial
sequence analysis of this insert fragment shows homology to
the DNA binding domain of the retinoic acid receptors.

The EcoRI insert was used to rescreen a second
library (a mouse lambda ZAPII day 6.5 cDNA library,
prepared as described below) under high stringency
conditions. A total of 21 phages were isolated and rescued
into the pSK vector. Partial sequencing allowed inserts
from 13 of these phages to be identified as having
sequences which overlap with XR5 11-17. The clone with the
longest single EcoRI insert was sequenced, revealing an
open reading frame of 556 amino acids. This sequence was
extended further upstream by 9bp from the furthest
5'-reaching clone.

The DNA sequence of the resulting clone is set
forth as Sequence ID No. 11.

The day 6.5 cDNA library, derived from 6.5 day
old mouse embryonic material, was prepared from poly A+
selected RNA (by oligo-dT priming), and cloned into the
EcoRI site of lambda gt10.

EXAMPLE V 

ISOLATION AND CHARACTERIZATION OF XR79 

The 550 bp BamHI restriction fragment, including
the DNA-binding domain of mouse RAR-beta-encoding DNA (See
[...] et al., Proc. Natl. Acad. Sci. [...]: 8289 ([...]))
was used to screen a [...] selected Drosophila genomic
library [...] restricted, at low stringency. The
hybridization mixture contained 35% formamide, 1X
Denhardt's, 5X SSPE (1X SSPE=0.15 M NaCl, 10mM Na2HPO4, 1mM
EDTA), 0.1% SDS, 10% dextran sulfate, 100 µg/ml denatured
salmon sperm DNA and 10^6 cpm of [32P]-labelled probe.
Duplicate nitrocellulose filters were hybridized for 16h at
42°C, washed once at 25°C for 15 min with 2X SSC (1X
SSC=0.15 M NaCl, 0.015 M sodium citrate), 0.1% SDS and then
washed twice at 55°C for 30 min. in 2X SSC, 0.1% SDS. The
filters were autoradiographed for 3 days at -70°C using an
intensifying screen.

After several rounds of screening, a pure
positive clone having an insert of about 3.5 kb is obtained
from the Drosophila genomic library. This genomic clone
was then used to screen a Drosophila imaginal disc lambda
[...] library (obtained from Dr. Charles Zuker; see DNA
Cloning, A practical approach, Vol I and II, D. M. Glover,
ed. (IRL Press (1985))). Hybridization conditions comprised
a hybridization mixture containing 50% formamide, 1X
Denhardt's, 5X SSPE, 0.1% SDS, 100 µg/ml denatured salmon
sperm DNA and 10^6 cpm of [32P]-labelled probe. Duplicate
nitrocellulose filters were hybridized for 16h at 42°C,
washed once at 25°C for 15 min with 2X SSC (1X SSC=0.15 M
NaCl, 0.015 M sodium citrate), 0.1% SDS and then washed
twice at 55°C for 30 min. in 2X SSC, 0.1% SDS. The filters
were autoradiographed for 3 days at -70°C using an
intensifying screen.

Sequence analysis of the positive cDNA clone
indicates that this clone encodes another novel member of
the steroid/thyroid superfamily of receptors (designated
"XR79", a 2.5 kb cDNA). See Sequence ID No. 13 for the DNA
sequence of the resulting clone.

The 2.5 kb cDNA encoding XR79 was nick-translated
and used as a probe for a nitrocellulose filter containing
size-fractionated total RNA, isolated by standard methods
from Drosophila melanogaster of different developmental
stages. The probe hybridized to a 2.5 kb transcript which
was present in RNA throughout development. The levels were
highest in RNA from 0-3 hour old embryos and lowest in
RNA from second instar larvae. The same 2.5 kb cDNA was
nick translated using biotinylated nucleotides and used as
a probe for in situ hybridization to whole Drosophila
embryos [Tautz and Pfeifle, Chromosoma 98: 81-85 (1989)].
The RNA distribution appeared relatively uniform at
different stages of embryogenesis.

EXAMPLE VI

SEQUENCE COMPARISONS OF INVENTION RECEPTORS
WITH hRARα, hTRβ, hGR, AND hRXRα

Amino acid sequences of XR1, hRAR-alpha (human
retinoic acid receptor-alpha), hTR-beta (human thyroid
hormone receptor-beta), hGR (human glucocorticoid
receptor), and hRXR-alpha (human retinoid X receptor-alpha)
were aligned using the University of Wisconsin Genetics
Computer Group program "Bestfit" (Devereux et al., supra).
The amino acid identity between XR1 and the
other receptors, i.e., in the 66 - 68 amino acid DNA
binding domains and the ligand-binding domains, is
summarized in Table 1 as percent amino acid identity.

TABLE 1

Percent amino acid identity between
receptor XR1 (verht19) and hRARα, hTRβ, hGR, and hRXRα

                    Percent amino acid identity
Comparison
receptor     Overall   N-term1   DNA-BD2   Ligand-BD3

hGR             18        21        45         20
hTRβ            31        14        59         30
hRARα           32        25        68         27
hRXRα           29        15        65         22

1 "N-term" = amino terminal domain
2 "DNA-BD" = receptor DNA binding domain
3 "Ligand-BD" = receptor ligand binding domain


Similarly, the amino acid sequences of invention
receptors XR2, XR4, XR5, and XR79 were compared with human
RAR-alpha (hRARα), human TR-beta (hTRβ), human
glucocorticoid (hGR) and human RXR-alpha (hRXRα). As done
in Table 1, the percentages of amino acid identity between
the invention receptors and the other receptors are
summarized in Tables 2 - 5, respectively.
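
The percent identity figures in Tables 1 - 5 come from pairwise alignments produced by the "Bestfit" program. As a rough illustration of what such a figure represents, the sketch below counts identical residues over the aligned, non-gap positions of two pre-aligned sequences. This is an assumption about the counting convention, not a description of how Bestfit itself reports identity; the function name and the toy sequences are hypothetical.

```python
def percent_identity(aligned_a: str, aligned_b: str, gap: str = "-") -> float:
    """Percent of aligned (non-gap) positions at which two sequences agree.

    Both inputs must already be aligned (equal length, gaps marked with `gap`).
    """
    if len(aligned_a) != len(aligned_b):
        raise ValueError("aligned sequences must have equal length")
    matches = 0
    compared = 0
    for a, b in zip(aligned_a, aligned_b):
        if a == gap or b == gap:
            continue  # positions opposite a gap are not counted
        compared += 1
        if a == b:
            matches += 1
    return 100.0 * matches / compared if compared else 0.0


if __name__ == "__main__":
    # Toy one-letter sequences, not taken from the Sequence Listing.
    print(percent_identity("CEGCKGFF", "CEGCKAFF"))
    print(percent_identity("AC-GT", "ACAGA"))
```

Restricting the inputs to a single domain (for example, only the DNA binding domain residues) would give a per-domain figure of the kind tabulated above.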






TABLE 2

Percent amino acid identity between
receptor XR2 and hRARα, hTRβ, hGR, and hRXRα

                    Percent amino acid identity
Comparison
receptor     Overall   N-term1   DNA-BD2   Ligand-BD3

hGR             24        21        50         20
hTRβ            31        19        56         29
hRARα           33        21        55         32
hRXRα           27        19        52         23

1 "N-term" = amino terminal domain
2 "DNA-BD" = receptor DNA binding domain
3 "Ligand-BD" = receptor ligand binding domain

TABLE 3

Percent amino acid identity between
receptor XR4 and hRARα, hTRβ, hGR, and hRXRα

                    Percent amino acid identity
Comparison
receptor     Overall   N-term1   DNA-BD2   Ligand-BD3

hGR             25        24        48         21
hTRβ            31        21        58         27
hRARα           32        22        62         29
hRXRα           33        24        62         28

1 "N-term" = amino terminal domain
2 "DNA-BD" = receptor DNA binding domain
3 "Ligand-BD" = receptor ligand binding domain

TABLE 4

Percent amino acid identity between
receptor XR5 and hRARα, hTRβ, hGR, and hRXRα

                    Percent amino acid identity
Comparison
receptor     Overall   N-term1   DNA-BD2   Ligand-BD3

hGR             20        20        44         20
hTRβ            24        14        52         22
hRARα           27        19        59         19
hRXRα           29        17        61         27

1 "N-term" = amino terminal domain
2 "DNA-BD" = receptor DNA binding domain
3 "Ligand-BD" = receptor ligand binding domain

TABLE 5

Percent amino acid identity between
receptor XR79 and hRARα, hTRβ, hGR, and hRXRα

                    Percent amino acid identity
Comparison
receptor     Overall   N-term1   DNA-BD2   Ligand-BD3

hGR             18        22        50         20
hTRβ            28        22        55         20
hRARα           24        14        59         18
hRXRα           33        20        65         24

1 "N-term" = amino terminal domain
2 "DNA-BD" = receptor DNA binding domain
3 "Ligand-BD" = receptor ligand binding domain


While the invention has been described in detail
with reference to certain preferred embodiments thereof, it
will be understood that modifications and variations are
within the spirit and scope of that which is described and
claimed.

SUMMARY OF SEQUENCES

Sequence ID No. 1 is a nucleotide sequence
encoding the novel receptor of the present invention
designated as "hXR1".

Sequence ID No. 2 is the amino acid sequence
deduced from the nucleotide sequence set forth in Sequence
ID No. 1 (variously referred to herein as receptor "XR1",
"hXR1", "hXR1.pep" or "verHT19.pep").

Sequence ID No. 3 is a nucleotide sequence
encoding the amino-terminal portion of the novel receptor
of the present invention designated as "hXR1prime".

Sequence ID No. 4 is the amino acid sequence
deduced from the nucleotide sequence set forth in Sequence
ID No. 3 (variously referred to herein as receptor
"XR1prime", "hXR1prime", "hXR1prime.pep" or "verHT3.pep").

Sequence ID No. 5 is a nucleotide sequence
encoding the amino-terminal portion of the novel receptor
of the present invention designated as "hXR1prim2".

Sequence ID No. 6 is the amino acid sequence
deduced from the nucleotide sequence set forth in Sequence
ID No. 5 (variously referred to herein as receptor
"XR1prim2", "hXR1prim2", "hXR1prim2.pep" or "verHr5.pep").

Sequence ID No. 7 is a nucleotide sequence
encoding the novel receptor of the present invention
designated as "hXR2".

Sequence ID No. 8 is the amino acid sequence
deduced from the nucleotide sequence set forth in Sequence
ID No. 7 (variously referred to herein as receptor "XR2",
"hXR2" or "hXR2.pep").

Sequence ID No. 9 is a nucleotide sequence
encoding the novel receptor of the present invention
referred to herein as "mXR4".

Sequence ID No. 10 is the amino acid sequence
deduced from the nucleotide sequence of Sequence ID No. 9
(variously referred to herein as receptor "XR4", "mXR4" or
"mXR4.pep").

Sequence ID No. 11 is the nucleotide sequence
encoding the novel receptor of the present invention
referred to as "mXR5".

Sequence ID No. 12 is the amino acid sequence
deduced from the nucleotide sequence of Sequence ID No. 11
(variously referred to herein as receptor "XR5", "mXR5" or
"mXR5.pep").

Sequence ID No. 13 is the nucleotide sequence
encoding the novel receptor of the present invention
referred to as "dXR79".

Sequence ID No. 14 is the amino acid sequence
deduced from the nucleotide sequence of Sequence ID No. 13
(variously referred to herein as "XR79", "dXR79" or
"dXR79.pep").
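
The coding-region (CDS) locations given in the Sequence Listing can be checked against the stated protein lengths with simple arithmetic. The sketch below does this for Sequence ID Nos. 1, 3 and 5, using the CDS locations and amino acid lengths printed in the listing. Whether a given CDS span includes a stop codon is an assumption made here (the full-length clone is assumed to include one; the two amino-terminal fragments are assumed to run to the end of the partial sequence without one), and all names are hypothetical.

```python
# (CDS start, CDS end, span assumed to include a stop codon?, stated length in aa)
CDS_FEATURES = {
    "SEQ ID NO:1": (79, 1725, True, 548),
    "SEQ ID NO:3": (90, 386, False, 99),
    "SEQ ID NO:5": (103, 300, False, 66),
}


def encoded_residues(start: int, end: int, includes_stop: bool) -> int:
    """Number of amino acids encoded by a CDS spanning start..end (1-based, inclusive)."""
    span = end - start + 1
    if span % 3 != 0:
        raise ValueError(f"CDS length {span} is not a whole number of codons")
    codons = span // 3
    return codons - 1 if includes_stop else codons


if __name__ == "__main__":
    for name, (start, end, includes_stop, stated) in CDS_FEATURES.items():
        computed = encoded_residues(start, end, includes_stop)
        status = "consistent" if computed == stated else "MISMATCH"
        print(f"{name}: {start}..{end} -> {computed} aa (stated {stated}): {status}")
```

A mismatch under these assumptions would point to a transcription error in either the feature location or the stated length.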

SEQUENCE LISTING



(1) GENERAL INFORMATION:

(i) APPLICANT: EVANS Ph.D., RONALD M.
MANGELSDORF Ph.D., DAVID J.
ONG Ms., ESTELITA S.
ORO Ph.D., ANTHONY E.
BORGMEYER Ph.D., UWE K.
GIGUERE Ph.D., VINCENT NMN
YAO Mr., TSO-PANG NMN

(ii) TITLE OF INVENTION: NOVEL RECEPTORS

(iii) NUMBER OF SEQUENCES: 14

(iv) CORRESPONDENCE ADDRESS:

(A) ADDRESSEE: Pretty, Schroeder, Brueggemann & Clark

(B) STREET: 444 So. Flower St., Suite 2000

(C) CITY: Los Angeles

(D) STATE: CA

(E) COUNTRY: US

(F) ZIP: 90071-2921

(v) COMPUTER READABLE FORM:

(A) MEDIUM TYPE: Floppy disk

(B) COMPUTER: IBM PC compatible

(C) OPERATING SYSTEM: PC-DOS/MS-DOS

(D) SOFTWARE: PatentIn Release #1.0, Version #1.25

(vi) CURRENT APPLICATION DATA:

(A) APPLICATION NUMBER:

(B) FILING DATE:

(C) CLASSIFICATION:

(viii) ATTORNEY/AGENT INFORMATION:

(A) NAME: Reiter Ph.D., Stephen E.

(B) REGISTRATION NUMBER: 31192

(C) REFERENCE/DOCKET NUMBER: P31 8936

(ix) TELECOMMUNICATION INFORMATION:

(A) TELEPHONE: (619) 535-9001

(B) TELEFAX: (619) 535-8949

(2) INFORMATION FOR SEQ ID NO:1:

(i) SEQUENCE CHARACTERISTICS:

(A) LENGTH: 1952 base pairs

(B) TYPE: nucleic acid

(C) STRANDEDNESS: single

(D) TOPOLOGY: linear

(ii) MOLECULE TYPE: cDNA

(vii) IMMEDIATE SOURCE:

(B) CLONE: XR1 (VERHT19.SEQ)

(ix) FEATURE:

(A) NAME/KEY: CDS

(B) LOCATION: 79..1725

(ix) FEATURE:

(A) NAME/KEY: misc_feature

(B) LOCATION: 349..1952

(D) OTHER INFORMATION: /product= "Carboxy terminal portion
of XR1 variant verht3"

(ix) FEATURE:

(A) NAME/KEY: misc_feature

(B) LOCATION: 352..1952

(D) OTHER INFORMATION: /product= "Carboxy terminal portion
of XR1 variant verhr5"

(xi) SEQUENCE DESCRIPTION: SEQ ID NO:1:

GAATTCGGCC ACTCCATACT ACACTCGCGC AAACCACACC CCCACTTTCT CGAGGCACAT 60 

GGGTAACCAC CAAAACGC ATG AAT CAC GGC CCC CCA GCA CAC ACT CAC TTA 111 

Met Asn Glu Cly Ala Pro Gly Asp Ser Asp Leu 
1 5 10 

GAG ACT GAG CCA AGA CTG CCG TGC TCA ATC ATG GGT CAT TGT CTT CCA 159 
Glu Thr Glu Ala Are Val Pro Trp Ser lie Met Cly His Cys Leu Are 

. 15 20 25 

ACT CCA CAC CCC AGA ATC TCT CCC ACA CCC ACA CCT CCA GGT CAA CCA 207 
Thr Gly Gin Ala Are Met Ser Ala Thr Pro Thr Pro Ala Cly Glu Cly 
30 35 40 

CCC AGA AGC TCT TCA ACC TCT ACC TCC CTG ACC AGG CTG TTC TCG TCT 255 
Ala Are Ser Ser Ser Thr Cys Ser Ser Leu Ser Are Leu Phe Trp Ser 
45 50 55 

CAA CTT GAG CAC ATA AAC TGC CAT GGA CCC ACA GCC AAG AAC TTT ATT 303 
Gin Leu Glu His He Asn Trp Asp Gly Ala Thr Ala Lys Asn Phe He 
60 65 70 75 

AAT TTA AGG CAC TTC TTC TCT TTT CTG CTC CCT CCA TTC AGA AAA CCT 351 
Asn Leu Are Glu Phe Phe Ser Phe Leu Leu Pro Ala Leu Are Lys Ala 
80 85 90 

CAA ATT CAA ATT ATT CCA TCC AAC ATC TCT CCA CAC AAA TCA TCA CCA 399 
Gin He Glu He He Pro Cys Lya He Cys Gly Asp Lys Ser Ser Cly 
95 100 105 

ATC CAT TAT GGT CTC ATT ACA TGT CAA CCC TCC AAG CCC TTT TTC ACC 447 
He His TVr Gly Val He Thr Cys Clu Cly Cys Lys Gig Phe Phe Arg 

AGA ACT CAG CAA AGC AAT GCC ACC TAC TCC TGT CCT CCT CAG AAG AAC 495 
Arg Ser Gin Gin Ser Asn Ala Thr Tyr Ser Cys Pro Are Cln Lys Asn 
125 130 135 

TGT TTC ATT CAT CCA ACC ACT AGA AAC CCC TGC CAA CAC TCT CCA TTA 543 
Cys Leu He Asp Arg Thr Ser Arg Asn Arg Cys Cln His Cys Arg Leu 
140 145 150 J 155 

CAG AAA TGC CTT GCC GTA GGC ATC TCT CCA CAT CCT GTA AAA TTT GCC 591 
Gin Lys Cys Leu Ala Val Cly Met Ser Are Asp Ala Val Lys Phe Cly 
160 165 170 

CGA ATG TCA AAA AAG CAG AGA GAC AGC TTC TAT GCA CAA GTA CAG AAA 639 
Arg Met Ser Lys Lys Gin Arg Asp Ser Leu Tyr Ala Glu Val Cln Lys 
175 " 180 185 
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CAC CCC ATC CAC CAG CAC CAG CGC CAC CAC CAG CAC CAG CC7 CCA GAG 687 
His Are Met Gin Cln Gin Gin Are Asp His Gin Cln Gin Pro Cly Clu 
6 190 195 200 

GCT GAG CCG CTG ACG CCC ACC TAC AAC ATC TCC CCC AAC GGG CTC ACG 735 
Ala Glu Pro Leu Thr Pro Thr Tyr Asn lie Ser Ala Asn Gly Leu Thr 

205 210— 215 

CAA CTT CAC GAC CAC CTC ACT AAC TAC ATT CAC GGG CAC ACC CCT GAG 783 
Glu Leu His Asp Asp Leu Ser Asn Tyr He Aso Cly His Thr Pro Clu 
220 225 230 235 

GGG ACT AAC CCA GAC TCC CCC CTC AGC AGC TTC TAC CTG GAC ATA CAG 831 
Gly Ser Lys Ala Asp Ser Ala Val Ser Ser Phe Tyr Leu Asp He Gin 
3 240 245 250 

CCT TCC CCA GAC CAG TCA GGT CTT GAT ATC AAT CCA ATC AAA CCA CAA 879 
Pro Ser Pro Asp Gin Ser Gly Leu Asp He Asn Gly He Lys Pro Glu 
255 260 265 

CCA ATA TCT GAC TAC ACA CCA CCA TCA CGC TTC TTT CCC TAC TCT TCC 927 
Pro He Cys Asp Tyr Thr Pro Ala Ser Gly Phe Phe Pro Tyr Cys Ser 
270 275 280 

TTC ACC AAC CGC CAC ACT TCC CCA ACT CTC TCC ATC CCA CAA TTA CAA 975 
Phe Thr Asn Cly Clu Thr Ser Pro Thr Val Ser Met Ala Glu Leu Glu 
285 290 295 

CAC CTT CCA CAG AAT ATA TCT AAA TCC CAT CTG GAA ACC TGC CAA TAC 1023 
His Leu Ala Cln Asn He Ser Lys Ser His Leu Clu Thr Cys Cln Tyr 
300 305 310 315 

TTC ACA GAA CAC CTC CAC CAG ATA ACG TGG CAC ACC TTT TTA CAC CAA 1071 
Leu Are Clu Glu Leu Gin Cln He Thr Trp Cln Thr Phe Leu Cln Glu 
320 325 330 

GAA ATT CAG AAC TAT CAA AAC AAC CAC CCG GAG CTC ATC TGG CAA TTC 1119 
Glu He Glu Asn Tyr Gin Asn Lys Cln Arg Clu Val Met Trp Cln Leu 
335 340 345 

TCT CCC ATC AAA ATT ACA GAA CCT ATA CAC TAT CTC CTC CAG TTT GCC 1167 
Cys Ala He Lys He Thr Clu Ala He Cln Tyr Val Val Clu Phe Ala 
350 355 360 

AAA CCC ATT GAT CCA TTT ATG CAA CTC TCT CAA AAT CAT CAA ATT CTC 1215 
Lys Are He Asp Cly Phe Met Clu Leu Cys Cln Asn Asp Cln He Val 
365 370 375 

CTT CTA AAA CCA CCT TCT CTA GAG CTC CTC TTT ATC ACA ATC TCC CCT 1263 
Leu Leu Lys Ala Cly Ser Leu Clu Val Val Phe He Arg Met Cys Arg 
380 385 390 395 

CCC TTT CAC TCT CAC AAC AAC ACC CTC TAC TTT GAT CCG AAC TAT CCC 1311 
Ala Phe Asp Ser Cln Asn Asn Thr Val Tyr Phe Asp Cly Lys Tyr Ala 
400 405 410 

AGC CCC GAC CTC TTC AAA TCC TTA GGT TCT CAA GAC TTT ATT AGC TTT 1359 
Ser Pro Asp Val Phe Lys Ser Leu Gly Cys Clu Asp Phe He Ser Phe 
415 420 425 

CTG TTT CAA TTT CCA AAC ACT TTA TCT TCT ATC CAC CTG ACT CAA CAT 1407 
Val Phe Clu Phe Cly Lys S r Leu Cys S r Met His Leu Thr Clu Asp 
430 435 440 

CAA ATT CCA TTA TTT TCT CCA TTT CTA CTC ATC TCA CCA CAT CCC TCA 1455 
Clu He Ala Leu Phe Ser Ala Phe Val Leu Met Ser Ala Asp Arg Ser 

445 450 455 
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TCC CTC CAA GAA AAG CTA AAA ATT GAA AAA CTC CAA CAG AAA ATT CAG 1503 
Trp Leu Gin Clu Lys Val Lys He Glu Lys Leu Gin Gin Lys II Gin 
460 465 470 475 

CTA CCT CTT CAA CAC CTC CTA CAG AAG AAT CAC CCA GAA GAT CCA ATA 1551 
_^_Ala_Leu Gin Hls_ Val_Leu Gin Lys_AsnHis Arg Glu_Asp Cly„Il " 
480" 485 • 490 

CTA ACA AAG TTA ATA TCC AAG CTC TCT ACA TTA AGA GCC TTA TGT CCA 1599 
Leu Thr Lys Leu He Cys Lys Val Ser Thr Leu Arg Ala Leu Cys Gly 
495 500 505 

CGA CAT ACA GAA AAG CTA ATG CCA TTT AAA CCA ATA TAC CCA GAC ATT 1647 
Arg His Thr Glu Lys Leu Met Ala Phe Lys Ala He Tyr Fro Asp He 
510 515 520 

CTG CGA CTT CAT TTT CCT CCA TTA TAC AAG CAC TTC TTC ACT TCA GAA 1695 
Val Arg Leu His Phe Pro Pro Leu Tyr Lys Clu Leu Phe Thr Ser Clu 
525 530 " 535 

TTT GAG CCA CCA ATG CAA ATT CAT GCC TAAATCTTAT CACCTAACCA 1742 
Phe Glu Pro Ala Met Gin He Asp Gly 
540 545 

CTTCTAGAAT GTCTGAAGTA CAAACATGAA AAACAAACAA AAAAATTAAC CCACACACTT 1802 " 

TATATGGCCC TCCACAGACC TGGAGCCCCA CACACTGCAC AT CTTTT CCT GATCGGGGTC 1862 

ACCCAAAGCA GCCGAAACAA TGAAAACAAA TAAAGTTGAA CTTGTTTTTC TCAAAAAAAA 1922 

AAAAAAAAAA AAAAAAAAAA AAAAAAAAAA 1952 

(2) INFORMATION FOR SEQ ID NO:2:

(i) SEQUENCE CHARACTERISTICS:

(A) LENGTH: 548 amino acids

(B) TYPE: amino acid
(D) TOPOLOGY: linear

(ii) MOLECULE TYPE: protein

(xi) SEQUENCE DESCRIPTION: SEQ ID NO:2:

Met Asn Glu Gly Ala Pro Gly Asp Ser Asp Leu Glu Thr Glu Ala Arg
1 5 10 15

Val Pro Trp Ser Ile Met Gly His Cys Leu Arg Thr Gly Gln Ala Arg
20 25 30

Met Ser Ala Thr Pro Thr Pro Ala Gly Glu Gly Ala Arg Ser Ser Ser
35 40 45

Thr Cys Ser Ser Leu Ser Arg Leu Phe Trp Ser Gln Leu Glu His Ile
50 55 60

Asn Trp Asp Gly Ala Thr Ala Lys Asn Phe Ile Asn Leu Arg Glu Phe
65 70 75 80

Phe Ser Phe Leu Leu Pro Ala Leu Arg Lys Ala Gln Ile Glu Ile Ile
85 90 95

Pro Cys Lys Ile Cys Gly Asp Lys Ser Ser Gly Ile His Tyr Gly Val
100 105 110

Ile Thr Cys Glu Gly Cys Lys Gly Phe Phe Arg Arg Ser Gln Gln Ser
115 120 125

Asn Ala Thr Tyr Ser Cys Pro Arg Gln Lys Asn Cys Leu Ile Asp Arg
130 135 140

Thr Ser Arg Asn Arg Cys Gln His Cys Arg Leu Gln Lys Cys Leu Ala
145 150 155 160

Val Gly Met Ser Arg Asp Ala Val Lys Phe Gly Arg Met Ser Lys Lys
165 170 175

Gln Arg Asp Ser Leu Tyr Ala Glu Val Gln Lys His Arg Met Gln Gln
180 185 190

Gln Gln Arg Asp His Gln Gln Gln Pro Gly Glu Ala Glu Pro Leu Thr
195 200 205

Pro Thr Tyr Asn Ile Ser Ala Asn Gly Leu Thr Glu Leu His Asp Asp
210 215 220

Leu Ser Asn Tyr Ile Asp Gly His Thr Pro Glu Gly Ser Lys Ala Asp
225 230 235 240

Ser Ala Val Ser Ser Phe Tyr Leu Asp Ile Gln Pro Ser Pro Asp Gln
245 250 255

Ser Gly Leu Asp Ile Asn Gly Ile Lys Pro Glu Pro Ile Cys Asp Tyr
260 265 270

Thr Pro Ala Ser Gly Phe Phe Pro Tyr Cys Ser Phe Thr Asn Gly Glu
275 280 285

Thr Ser Pro Thr Val Ser Met Ala Glu Leu Glu His Leu Ala Gln Asn
290 295 300

Ile Ser Lys Ser His Leu Glu Thr Cys Gln Tyr Leu Arg Glu Glu Leu
305 310 315 320

Gln Gln Ile Thr Trp Gln Thr Phe Leu Gln Glu Glu Ile Glu Asn Tyr
325 330 335

Gln Asn Lys Gln Arg Glu Val Met Trp Gln Leu Cys Ala Ile Lys Ile
340 345 350

Thr Glu Ala Ile Gln Tyr Val Val Glu Phe Ala Lys Arg Ile Asp Gly
355 360 365

Phe Met Glu Leu Cys Gln Asn Asp Gln Ile Val Leu Leu Lys Ala Gly
370 375 380

Ser Leu Glu Val Val Phe Ile Arg Met Cys Arg Ala Phe Asp Ser Gln
385 390 395 400

Asn Asn Thr Val Tyr Phe Asp Gly Lys Tyr Ala Ser Pro Asp Val Phe
405 410 415

Lys Ser Leu Gly Cys Glu Asp Phe Ile Ser Phe Val Phe Glu Phe Gly
420 425 430

Lys Ser Leu Cys Ser Met His Leu Thr Glu Asp Glu Ile Ala Leu Phe
435 440 445

Ser Ala Phe Val Leu Met Ser Ala Asp Arg Ser Trp Leu Gln Glu Lys
450 455 460

Val Lys Ile Glu Lys Leu Gln Gln Lys Ile Gln Leu Ala Leu Gln His
465 470 475 480

Val Leu Gln Lys Asn His Arg Glu Asp Gly Ile Leu Thr Lys Leu Ile
485 490 495

Cys Lys Val Ser Thr Leu Arg Ala Leu Cys Gly Arg His Thr Glu Lys
500 505 510

Leu Met Ala Phe Lys Ala Ile Tyr Pro Asp Ile Val Arg Leu His Phe
515 520 525

Pro Pro Leu Tyr Lys Glu Leu Phe Thr Ser Glu Phe Glu Pro Ala Met
530 535 540

Gln Ile Asp Gly
545

(2) INFORMATION FOR SEQ ID NO:3:

(i) SEQUENCE CHARACTERISTICS:

(A) LENGTH: 386 base pairs

(B) TYPE: nucleic acid

(C) STRANDEDNESS: single

(D) TOPOLOGY: linear

(ii) MOLECULE TYPE: cDNA

(vii) IMMEDIATE SOURCE:

(B) CLONE: AMINO TERMINAL PORTION OF XR1PRIME (VERHT3.SEQ)

(ix) FEATURE:

(A) NAME/KEY: CDS

(B) LOCATION: 90..386

(xi) SEQUENCE DESCRIPTION: SEQ ID NO:3:
CCATCTCTCT GATCACCTTG CACTCCATAC TACACTGCGG CAAACCACAC CCCCAGTTTC 60 



TGCAGGCAGA TCGGTAACCA CCAAAACGC ATG AAT CAG CGG CCC CCA GGA GAC 113
Met Asn Glu Gly Ala Pro Gly Asp
1 5

ACT GAC TTA GAC ACT GAC CCA AGA GTG CCC TCC TCA ATC ATC CCT CAT 161
Ser Asp Leu Glu Thr Glu Ala Arg Val Pro Trp Ser Ile Met Gly His
10 15 20

TGT CTT CCA ACT CGA CAG CCC AGA ATG TCT CCC ACA CCC ACA CCT CCA 209
Cys Leu Arg Thr Gly Gln Ala Arg Met Ser Ala Thr Pro Thr Pro Ala
25 30 35 40

GGT GAA CGA CCC AGA AGG GAT GAA CTT TTT CCC ATT CTC CAA ATA CTC 257
Gly Glu Gly Ala Arg Arg Asp Glu Leu Phe Gly Ile Leu Gln Ile Leu
45 50 55

CAT CAG TCT ATC CTG TCT TCA GGT CAT CCT TTT CTT CTT ACT GGC CTC 305
His Gln Cys Ile Leu Ser Ser Gly Asp Ala Phe Val Leu Thr Gly Val
60 65 70

TCT TGT TCC TCC AGG CAG AAT CCC AAC CCA CCA TAT TCA CAA AAC GAA 353
Cys Cys Ser Trp Arg Gln Asn Gly Lys Pro Pro Tyr Ser Gln Lys Glu
75 80 85

CAT AAG CAA CTA CAA ACT CCA TAC ATG AAT CCT 386
Asp Lys Glu Val Gln Thr Gly Tyr Met Asn Ala
90 95

(2) INFORMATION FOR SEQ ID NO:4:

(i) SEQUENCE CHARACTERISTICS:

(A) LENGTH: 99 amino acids

(B) TYPE: amino acid
(D) TOPOLOGY: linear

(ii) MOLECULE TYPE: protein

(xi) SEQUENCE DESCRIPTION: SEQ ID NO:4:

Met Asn Glu Gly Ala Pro Gly Asp Ser Asp Leu Glu Thr Glu Ala Arg
1 5 10 15

Val Pro Trp Ser Ile Met Gly His Cys Leu Arg Thr Gly Gln Ala Arg
20 25 30

Met Ser Ala Thr Pro Thr Pro Ala Gly Glu Gly Ala Arg Arg Asp Glu
35 40 45

Leu Phe Gly Ile Leu Gln Ile Leu His Gln Cys Ile Leu Ser Ser Gly
50 55 60

Asp Ala Phe Val Leu Thr Gly Val Cys Cys Ser Trp Arg Gln Asn Gly
65 70 75 80

Lys Pro Pro Tyr Ser Gln Lys Glu Asp Lys Glu Val Gln Thr Gly Tyr
85 90 95

Met Asn Ala

(2) INFORMATION FOR SEQ ID NO: 5: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 300 base pairs 

(B) TYPE: nucleic acid 

(C) STRANDEDNESS: single

(D) TOPOLOGY: linear

(ii) MOLECULE TYPE: cDNA

(vii) IMMEDIATE SOURCE:

(B) CLONE: AMINO TERMINAL PORTION OF XR1PRIM2 (VERHR5.SEQ)

(ix) FEATURE: 

(A) NAME/KEY: CDS 

(B) LOCATION: 103.. 300 

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 5: 

GTTTTTTTTT TTTTTTTGCT ACCATACACT TGCTCTGAAA ACACAAGATA GACGGAGTCT 60

CCGAGCTCGC CATCTCCAGC CATCTCTACA TTGGGAAAAA AC ATC GAG TCA GCT 114 

Met Glu Ser Ala
1

CCC CCA ACG GAG ACC CCG CTC AAC CAG GAA TCC CCC CCC CCC CAC CCC 162
Pro Ala Arg Glu Thr Pro Leu Asn Gln Glu Ser Ala Ala Pro Asp Pro
5 10 15 20

CCC GCC AGC CAC CCA CCC ACC ACC GGC CCG CAC GCG CCC CCC CCC TCC 210
Ala Ala Ser Glu Pro Gly Ser Ser Gly Ala Asp Ala Ala Ala Gly Ser
25 30 35

CCC AAC AGC GAG CCC CCT GCC CCC GTG CGC AGA CAG AGC TAT TCC ACC 258
Arg Lys Ser Glu Pro Pro Ala Pro Val Arg Arg Gln Ser Tyr Ser Ser
40 45 50

ACC AGC ACA CCT ATC TCA CTA ACC AAG AAC ACA CAT ACA TCT 300
Thr Ser Arg Gly Ile Ser Val Thr Lys Lys Thr His Thr Ser



(2) INFORMATION FOR SEQ ID NO: 6: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 66 amino acids 

(B) TYPE: amino acid
(D) TOPOLOGY: linear

(ii) MOLECULE TYPE: protein

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 6:

Met Glu Ser Ala Pro Ala Arg Glu Thr Pro Leu Asn Gln Glu Ser Ala
1 5 10 15

Ala Pro Asp Pro Ala Ala Ser Glu Pro Gly Ser Ser Gly Ala Asp Ala
20 25 30

Ala Ala Gly Ser Arg Lys Ser Glu Pro Pro Ala Pro Val Arg Arg Gln
35 40 45

Ser Tyr Ser Ser Thr Ser Arg Gly Ile Ser Val Thr Lys Lys Thr His
50 55 60

Thr Ser
65
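
The CDS location given for SEQ ID NO: 5 (103..300) and the 66-residue length stated for SEQ ID NO: 6 can be cross-checked arithmetically. The following is a minimal sketch, not part of the original listing; the function names are hypothetical, and it assumes that locations are 1-based and inclusive and that a full-length CDS may end in one stop codon that is not counted as a residue.

```python
def cds_codon_count(start: int, end: int) -> int:
    """Return the number of whole codons in an inclusive CDS location start..end."""
    length = end - start + 1  # assumed: 1-based, inclusive coordinates
    if length <= 0 or length % 3 != 0:
        raise ValueError(f"CDS {start}..{end} does not span whole codons")
    return length // 3


def matches_protein_length(start: int, end: int, protein_length: int,
                           has_stop: bool = False) -> bool:
    """Compare a CDS span with a stated protein length.

    has_stop=True subtracts one codon for a terminal stop codon.
    """
    return cds_codon_count(start, end) - (1 if has_stop else 0) == protein_length


# Values from the listing: SEQ ID NO: 5 CDS at 103..300; SEQ ID NO: 6 has 66 residues.
print(cds_codon_count(103, 300))             # 66
print(matches_protein_length(103, 300, 66))  # True
```

The same check can be applied to the other cDNA and protein pairs in the listing by passing has_stop=True where the CDS location includes the stop codon.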

(2) INFORMATION FOR SEQ ID NO: 7: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 1659 base pairs 

(B) TYPE: nucleic acid 

(C) STRANDEDNESS: single

(D) TOPOLOGY: linear

(ii) MOLECULE TYPE: cDNA

(vii) IMMEDIATE SOURCE:

(B) CLONE: XR2 (XR2.SEG)

(ix) FEATURE:

(A) NAME/KEY: CDS

(B) LOCATION: 148..1470

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 7:

GATATCCGTC ACATCATTCC CTGACTCCAC TCCAAAAAGC TGTCCCCAGA GCACGAGGCC 60 

AATGACAGCT CCCAGCGCAC TCATCTTGAC TCCTCTTGCC TGCGCATTTG GACAGTCCCT 120

TGGTAATCAC CACCCCTCCA CAAAGAC ATG TCC TTG TGG CTG CGG GCC CCT 171
Met Ser Leu Trp Leu Gly Ala Pro
1 5

GTG CCT CAC ATT CCT CCT GAC TCT CCC CTG GAG CTC TGG AAC CCA CCC 219
Val Pro Asp Ile Pro Pro Asp Ser Ala Val Glu Leu Trp Lys Pro Gly

CCA CAC GAT CCA AGC AGC CAC CCC CAC CCA CGC ACC ACC TCC ATC CTC 267
Ala Gln Asp Ala Ser Ser Gln Ala Gln Gly Gly Ser Ser Cys Ile Leu

AGA GAG GAA CCC AGG ATG CCC CAC TCT GCT CCG CGT ACT CCA CAC CCC 315
Arg Glu Glu Ala Arg Met Pro His Ser Ala Gly Gly Thr Ala Glu Pro

ACA CCC CTC CTC ACC AGC CCA CAG CCC CCT TCA GAA CCC ACA CAC ATC 363
Thr Ala Leu Leu Thr Arg Ala Glu Pro Pro Ser Glu Pro Thr Glu Ile
60 65 70

CCT CCA CAA AAC CGC AAA AAC GGG CCA CCC CCC AAA ATC CTC CCC AAC 411
Arg Pro Gln Lys Arg Lys Lys Gly Pro Ala Pro Lys Met Leu Gly Asn

GAG CTA TGC ACC CTG TCT CCC CAC AAC CCC TCG CGC TTC CAC TAC AAT 459
Glu Leu Cys Ser Val Cys Gly Asp Lys Ala Ser Gly Phe His Tyr Asn

GTT CTG AGC TCC GAG CCC TCC AAC CCA TTC TTC CCC CCC ACC CTC ATC 507
Val Leu Ser Cys Glu Gly Cys Lys Gly Phe Phe Arg Arg Ser Val Ile
105 110 115 120

AAC CCA GCG CAC TAC ATC TCC CAC ACT CGC CGC CAC TCC CCC ATG CAC 555
Lys Gly Ala His Tyr Ile Cys His Ser Gly Gly His Cys Pro Met Asp

ACC TAC ATC CCT CCC AAC TCC CAG GAG TCT CCC CTT CCC AAA TGC CGT 603
Thr Tyr Met Arg Arg Lys Cys Gln Glu Cys Arg Leu Arg Lys Cys Arg

CAG GCT CGC ATG CGC GAG CAG TCT CTC CTC TCA GAA CAA CAG ATC CCC 651
Gln Ala Gly Met Arg Glu Glu Cys Val Leu Ser Glu Glu Gln Ile Arg

CTG AAC AAA CTC AAC CCG CAA GAG CAG GAA CAC CCT CAT CCC ACA TCC 699
Leu Lys Lys Leu Lys Arg Gln Glu Glu Glu Gln Ala His Ala Thr Ser

TTC CCC CCC AGC CCT TCC TCA CCC CCC CAA ATC CTC CCC CAG CTC ACC 747
Leu Pro Pro Arg Arg Ser Ser Pro Pro Gln Ile Leu Pro Gln Leu Ser
185 190 195

CCG GAA CAA CTC CGC ATG ATC CAG AAC CTC CTC CCT CCC CAG CAA CAC 795
Pro Glu Gln Leu Gly Met Ile Glu Lys Leu Val Ala Ala Gln Gln Gln
205 210 215

TCT AAC CCC CCC TCC TTT TCT CAC CGC CTT CCA CTC ACC CCT TCC CCC 843
Cys Asn Arg Arg Ser Phe Ser Asp Arg Leu Arg Val Thr Pro Trp Pro

ATC CCA CCA CAT CCC CAT ACC CGC GAG CCC CGT CAG CAC CCC TTT CCC 891
Met Ala Pro Asp Pro His Ser Arg Glu Ala Arg Gln Gln Arg Phe Ala
235 240 245

CAC TTC ACT CAG CTG GCC ATC CTC TCT CTG CAC GAG ATA GTT CAC TTT 939
His Phe Thr Glu Leu Ala Ile Val Ser Val Gln Glu Ile Val Asp Phe
250 255 260

CCT AAA CAG CTA CCC CGC TTC CTC CAG CTC ACC CCG CAC CAC CAC ATT 987
Ala Lys Gln Leu Pro Gly Phe Leu Gln Leu Ser Arg Glu Asp Gln Ile
265 270 275 280

rrr CTC CIO AAC ACC TCT CCG ATC CAG CTG ATG CTT CTG OAC ACA TCT 1035
Ala Leu Leu Lys Thr Ser Ala Ile Glu Val Met Leu Leu Glu Thr Ser

CCC AGG TAC AAC CCT GGC ACT GAG ACT ATC ACC TTC CTC AAC CAT TTC 1083
Arg Arg Tyr Asn Pro Gly Ser Glu Ser Ile Thr Phe Leu Lys Asp Phe

AGT TAT AAC CGG GAA GAC TTT GCC AAA CCA GGC CTC CAA CTC GAA TTC 1131
Ser Tyr Asn Arg Glu Asp Phe Ala Lys Ala Gly Leu Gln Val Glu Phe
315 320 325

ATC AAC CCC ATC TTC GAG TTC TCC AGG CCC ATC AAT GAG CTC CAA CTC 1179
Ile Asn Pro Ile Phe Glu Phe Ser Arg Ala Met Asn Glu Leu Gln Leu
330 335 340

AAT GAT GCC GAC TTT GCC TTG CTC ATT CCT ATC AGC ATC TTC TCT GCA 1227
Asn Asp Ala Glu Phe Ala Leu Leu Ile Ala Ile Ser Ile Phe Ser Ala
345 350 355 360

GAC CGG CCC AAC GTG CAG GAC CAG CTC CAG GTG GAG AGG CTG CAG CAC 1275
Asp Arg Pro Asn Val Gln Asp Gln Leu Gln Val Glu Arg Leu Gln His
365 370 375

ACA TAT CTG GAA GCC CTG CAT GCC TAC CTC TCC ATC CAC CAT CCC CAT 1323
Thr Tyr Val Glu Ala Leu His Ala Tyr Val Ser Ile His His Pro His
380 385 390

CAC CGA CTG ATC TTC CCA CCG ATC CTA ATG AAA CTC GTG AGC CTC CGG 1371
Asp Arg Leu Met Phe Pro Arg Met Leu Met Lys Leu Val Ser Leu Arg
395 400 405

ACC CTC AGC ACC CTC CAC TCA GAG CAA GTG TTT GCA CTG CCT CTG CAG 1419
Thr Leu Ser Ser Val His Ser Glu Gln Val Phe Ala Leu Arg Leu Gln
410 415 420

CAC AAA AAC CTC CCA CCG CTG CTC TCT CAG ATC TCG GAT GTG CAC GAA 1467
Asp Lys Lys Leu Pro Pro Leu Leu Ser Glu Ile Trp Asp Val His Glu
425 430 435 440

TCACTGTTCT GTCCCCATAT TTTCTGTTTT CTTGCCCCCA TCGCTGACCC CTCGTGGCTC 1527 

CCTCCTAGAA CTGCAACAGA CTCACAACCC CAAACATTCC TGCGACCTCC CCAAGCACAT 1587 

CCTCCCGTCG CATTAAAACA CACTCAAACG CTAAAAAAAV AAAAAAAAAA AAAAAAAAAA 1647 

AAAAAGGAAT TC 1659

(2) INFORMATION FOR SEQ ID NO: 8:

(i) SEQUENCE CHARACTERISTICS:

(A) LENGTH: 440 amino acids

(B) TYPE: amino acid
(D) TOPOLOGY: linear

(ii) MOLECULE TYPE: protein

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 8:

Met Ser Leu Trp Leu Gly Ala Pro Val Pro Asp Ile Pro Pro Asp Ser
1 5 10 15

Ala Val Glu Leu Trp Lys Pro Gly Ala Gln Asp Ala Ser Ser Gln Ala
20 25 30

Gln Gly Gly Ser Ser Cys Ile Leu Arg Glu Glu Ala Arg Met Pro His
35 40 45

Ser Ala Gly Gly Thr Ala Glu Pro Thr Ala Leu Leu Thr Arg Ala Glu
50 55 60

Pro Pro Ser Glu Pro Thr Glu Ile Arg Pro Gln Lys Arg Lys Lys Gly
65 70 75 80

Pro Ala Pro Lys Met Leu Gly Asn Glu Leu Cys Ser Val Cys Gly Asp
85 90 95

Lys Ala Ser Gly Phe His Tyr Asn Val Leu Ser Cys Glu Gly Cys Lys
100 105 110

Gly Phe Phe Arg Arg Ser Val Ile Lys Gly Ala His Tyr Ile Cys His
115 120 125

Ser Gly Gly His Cys Pro Met Asp Thr Tyr Met Arg Arg Lys Cys Gln
130 135 140

Glu Cys Arg Leu Arg Lys Cys Arg Gln Ala Gly Met Arg Glu Glu Cys
145 150 155 160

Val Leu Ser Glu Glu Gln Ile Arg Leu Lys Lys Leu Lys Arg Gln Glu
165 170 175

Glu Glu Gln Ala His Ala Thr Ser Leu Pro Pro Arg Arg Ser Ser Pro
180 185 190

Pro Gln Ile Leu Pro Gln Leu Ser Pro Glu Gln Leu Gly Met Ile Glu
195 200 205

Lys Leu Val Ala Ala Gln Gln Gln Cys Asn Arg Arg Ser Phe Ser Asp
210 215 220

Arg Leu Arg Val Thr Pro Trp Pro Met Ala Pro Asp Pro His Ser Arg
225 230 235 240

Glu Ala Arg Gln Gln Arg Phe Ala His Phe Thr Glu Leu Ala Ile Val
245 250 255

Ser Val Gln Glu Ile Val Asp Phe Ala Lys Gln Leu Pro Gly Phe Leu
260 265 270

Gln Leu Ser Arg Glu Asp Gln Ile Ala Leu Leu Lys Thr Ser Ala Ile
275 280 285

Glu Val Met Leu Leu Glu Thr Ser Arg Arg Tyr Asn Pro Gly Ser Glu
290 295 300

Ser Ile Thr Phe Leu Lys Asp Phe Ser Tyr Asn Arg Glu Asp Phe Ala
305 310 315 320

Lys Ala Gly Leu Gln Val Glu Phe Ile Asn Pro Ile Phe Glu Phe Ser
325 330 335

Arg Ala Met Asn Glu Leu Gln Leu Asn Asp Ala Glu Phe Ala Leu Leu
340 345 350

Ile Ala Ile Ser Ile Phe Ser Ala Asp Arg Pro Asn Val Gln Asp Gln
355 360 365

Leu Gln Val Glu Arg Leu Gln His Thr Tyr Val Glu Ala Leu His Ala
370 375 380

Tyr Val Ser Ile His His Pro His Asp Arg Leu Met Phe Pro Arg Met
385 390 395 400

Leu Met Lys Leu Val Ser Leu Arg Thr Leu Ser Ser Val His Ser Glu
405 410 415

Gln Val Phe Ala Leu Arg Leu Gln Asp Lys Lys Leu Pro Pro Leu Leu
420 425 430

Ser Glu Ile Trp Asp Val His Glu
435 440
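
The relationship between a cDNA coding region and its deduced protein (for example, SEQ ID NO: 7 and SEQ ID NO: 8) is that of codon-by-codon translation. The sketch below illustrates this with the standard genetic code, applied to the first five codons shown for the SEQ ID NO: 7 coding region (ATG TCC TTG TGG CTG, listed as Met Ser Leu Trp Leu). It is an illustration only; the function name and table layout are assumptions and not part of the listing, and it uses one-letter residue codes.

```python
from itertools import product

BASES = "TCAG"
# Standard genetic code in TCAG order, one letter per codon ('*' marks a stop codon).
AMINO_ACIDS = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {
    "".join(codon): aa
    for codon, aa in zip(product(BASES, repeat=3), AMINO_ACIDS)
}


def translate(cds: str) -> str:
    """Translate a coding sequence into one-letter residues, stopping at a stop codon."""
    cds = cds.replace(" ", "").upper()
    protein = []
    # Ignore any trailing partial codon.
    for i in range(0, len(cds) - len(cds) % 3, 3):
        residue = CODON_TABLE[cds[i:i + 3]]
        if residue == "*":
            break
        protein.append(residue)
    return "".join(protein)


print(translate("ATG TCC TTG TGG CTG"))  # MSLWL
```

A codon containing a character other than A, C, G or T raises a KeyError, which is a convenient way to flag transcription errors in a sequence before relying on its translation.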



(2) INFORMATION FOR SEQ ID NO: 9: 

(i) SEQUENCE CHARACTERISTICS:

(A) LENGTH: 2009 base pairs

(B) TYPE: nucleic acid

(C) STRANDEDNESS: single

(D) TOPOLOGY: linear

(ii) MOLECULE TYPE: cDNA

(vii) IMMEDIATE SOURCE:

(B) CLONE: XR4 (XR4.SEG)

(ix) FEATURE:

(A) NAME/KEY: CDS

(B) LOCATION: 263..1582



(xi) SEQUENCE DESCRIPTION: SEQ ID NO:9: 

GAATTCCCTG GGGATTAATG CGAAAAGTTT TCCCAGGACC TGGCCGATTC TGCCGACCCT 60 

GCCGGACGCC GGCAGCCCCG CCAGAGCCCG CCCCCACAGT CCTCTCCACC CCTCTCCCTA 120 

TCCCCATCCG ACTCACTCAG ACGCTCCTGC TCACTGACAG ATGAACACAA ACCCACCCTA 180 

AAGGCACTCC ATCTGCGCTC AGACCCAGAT CCTCGCACAG CTATCACCAG CCCTGCACCC 240 

CCACGCCAAG TGGGCGTCAG TC ATG CAA CAC CCA CAC GAG GAG ACC CCT GAG 292
Met Glu Gln Pro Gln Glu Glu Thr Pro Glu
1 5 10

GCC CGG CAA GAC GAG AAA GAG CAA GTG GCC ATG CCT CAC CCA GCC CCG 340
Ala Arg Glu Glu Glu Lys Glu Glu Val Ala Met Gly Asp Gly Ala Pro
15 20 25

GAG CTC AAT GGG GCA CCA CAA CAC ACC CTT CCT TCC AGC AGC TGT CCA 388
Glu Leu Asn Gly Gly Pro Glu His Thr Leu Pro Ser Ser Ser Cys Ala
30 35 40

GAC CTC TCC CAC AAT TCC TCC CCT TCC TCC CTC CTC CAC CAC CTC CAC 436
Asp Leu Ser Gln Asn Ser Ser Pro Ser Ser Leu Leu Asp Gln Leu Gln
45 50 55

ATG GGC TGT GAT GGC CCC TCA GGC GGC AGC CTC AAC ATG GAA TGT CGG 484
Met Gly Cys Asp Gly Ala Ser Gly Gly Ser Leu Asn Met Glu Cys Arg
60 65 70

GTG TGC GGG CAC AAC GCC TCG GGC TTC CAC TAC GCG CTC CAC CCG TCC 532
Val Cys Gly Asp Lys Ala Ser Gly Phe His Tyr Gly Val His Ala Cys
75 80 85 90

GAG GCG TCC AAG GGC TTC TTC CCC CGG ACA ATC CCC ATG AAG CTC GAG 580
Glu Gly Cys Lys Gly Phe Phe Arg Arg Thr Ile Arg Met Lys Leu Glu
95 100 105

TAT GAG AAG TCC GAT CCC ATC TCC AAG ATC CAC AAC AAG AAC CCC AAC 628
Tyr Glu Lys Cys Asp Arg Ile Cys Lys Ile Gln Lys Lys Asn Arg Asn
110 115 120

AAG TCT CAC TAC TCC CCC TTC CAG AAG TCC CTG CCA CTC CCC ATC TCC 676
Lys Cys Gln Tyr Cys Arg Phe Gln Lys Cys Leu Ala Leu Gly Met Ser
125 130 135

CAC AAC CCT ATC CGC TTT CGA CCC ATC CCC CAC CCC CAG AAC AGG AAG 724
His Asn Ala Ile Arg Phe Gly Arg Met Pro Asp Gly Glu Lys Arg Lys
140 145 150

CTC GTC GCG GGG CTC ACT CCC ACC CAG CCC TCC CAG CAC AAC CCC CAC 772
Leu Val Ala Gly Leu Thr Ala Ser Glu Gly Cys Gln His Asn Pro Gln
155 160 165 170

CTC GCC CAC CTG AAG CCC TTC TCT AAG CAC ATC TAC AAC CCC TAC CTG 820
Leu Ala Asp Leu Lys Ala Phe Ser Lys His Ile Tyr Asn Ala Tyr Leu
175 180 185

AAA AAC TTC AAC ATC ACC AAA AAG AAG GCC CGC AGC ATC CTC ACC GCC 868
Lys Asn Phe Asn Met Thr Lys Lys Lys Ala Arg Ser Ile Leu Thr Gly
190 195 200

AAG TCC AGC CAC AAC CCA CCC TTT CTC ATC CAC CAC ATC GAG ACA CTG 916
Lys Ser Ser His Asn Ala Pro Phe Val Ile His Asp Ile Glu Thr Leu
205 210 215

TCC CAG GCA CAG AAG CCC CTC CTG TCC AAA CAG CTC CTG AAC CTG CCG 964
Trp Gln Ala Glu Lys Gly Leu Val Trp Lys Gln Leu Val Asn Val Pro
220 225 230

CCC TAC AAC CAC ATC ACT CTG CAC GTC TTC TAC CGC TCC CAG TCC ACC 1012
Pro Tyr Asn Glu Ile Ser Val His Val Phe Tyr Arg Cys Gln Ser Thr
235 240 245 250

ACA CTG CAG ACA CTC CGA GAG CTC ACC CAG TTC CCC AAC AAC ATC CCC 1060
Thr Val Glu Thr Val Arg Glu Leu Thr Glu Phe Ala Lys Asn Ile Pro
255 260 265

AAC TTC ACC ACC CTC TTC CTC AAT CAC CAC GTC ACC CTC CTC AAC TAT 1108
Asn Phe Ser Ser Leu Phe Leu Asn Asp Gln Val Thr Leu Leu Lys Tyr
270 275 280

CCC CTG CAC CAC CCC ATC TTT CCC ATC CTC CCC TCC ATC CTC AAC AAA 1156
Gly Val His Glu Ala Ile Phe Ala Met Leu Ala Ser Ile Val Asn Lys
285 290 295

CAC CGC CTG CTC CTC CCC AAC CGC ACT CCC TTC CTC ACC CAC CAG TTC 1204
Asp Gly Leu Leu Val Ala Asn Gly Ser Gly Phe Val Thr His Glu Phe
300 305 310

TTC CCA ACT CTC CCC AAC CCC TTC ACT CAC ATC ATT CAC CCC AAG TTC 1252
Leu Arg Ser Leu Arg Lys Pro Phe Ser Asp Ile Ile Glu Pro Lys Phe
315 320 325 330

CAC TTT CCT CTC AAG TTC AAT GCG CTG GAC CTC CAT CAC AGT CAC CTG 1300
Glu Phe Ala Val Lys Phe Asn Ala Leu Glu Leu Asp Asp Ser Asp Leu
335 340 345

CCC CTC TTC ATC CCC CCC ATC ATT CTC TCT CCA CAC CCC CCA CCC CTC 1348
Ala Leu Phe Ile Ala Ala Ile Ile Leu Cys Gly Asp Arg Pro Gly Leu
350 355 360

ATC AAT CTC CCC CAC CTA GAA GCC ATC CAG CAC ACC ATT CTG CCC CCT 1396
Met Asn Val Pro Gln Val Glu Ala Ile Gln Asp Thr Ile Leu Arg Ala
365 370 375

CTA GAA TTC CAT CTC CAG CTC AAC CAC CCT CAC ACC CAC TAC CTC TTC 1444
Leu Glu Phe His Leu Gln Val Asn His Pro Asp Ser Gln Tyr Leu Phe
380 385 390

CCC AAG CTG CTG CAG AAG ATC CCA GAC CTG CCG CAC GTC CTC ACT GAG 1492
Pro Lys Leu Leu Gln Lys Met Ala Asp Leu Arg His Val Val Thr Glu
395 400 405 410

CAT CCC CAG ATG ATC CAG TGG CTA AAG AAG ACC CAG ACT GAG ACC TTC 1540
His Ala Gln Met Met Gln Trp Leu Lys Lys Thr Glu Ser Glu Thr Leu
415 420 425

CTG CAC CCC CTC CTC CAG GAA ATC TAC AAC CAC ATC TAC TAAGCCCCCA 1589
Leu His Pro Leu Leu Gln Glu Ile Tyr Lys Asp Met Tyr

430 435 440 

CCCCACCCCT CCCCTCAGCC TCTGCTCCCC CCACCCACCG ACTCTTCAGA GGACCACCCA 1649 

CAGGCACTGC CACTCAAGCA GCTAGACCCT ACTCACAACA CTCCACACAC CTCCCCCAGA 1709 

CTCTTCCCCC AACACCCCCA CCCCCACCAA CCCCCCCATT CCCCCAACCC CCCTCCCCCA 1769 

CCCCGCTCTC CCCATGCCCC CTTTCCTGTT TCTCCTCAGC ACCTCCTCTT CTTGCTCTCT 1829 

CCCTAGCGCC CTTGCTCCCC CCCCTTTGCC TTCCTTCTCT AGCATCCCCC TCCTCCCAGT 1889 

CCTCACATTT GTCTGATTCA CAGCAGACAG CCCCTTGCTA CCCTCACCAG CAGCCTAAAA 1949 

GCAGTGGGCC TCTGCTCCCC CAGTCCTCCC TCTCCTCTCT ATCCCCTTCA AAGGGAATTC 2009 



(2) INFORMATION FOR SEQ ID NO: 10: 

(i) SEQUENCE CHARACTERISTICS:

(A) LENGTH: 439 amino acids

(B) TYPE: amino acid
(D) TOPOLOGY: linear

(ii) MOLECULE TYPE: protein

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 10:

Met Glu Gln Pro Gln Glu Glu Thr Pro Glu Ala Arg Glu Glu Glu Lys
1 5 10 15

Glu Glu Val Ala Met Gly Asp Gly Ala Pro Glu Leu Asn Gly Gly Pro
20 25 30

Glu His Thr Leu Pro Ser Ser Ser Cys Ala Asp Leu Ser Gln Asn Ser
35 40 45

Ser Pro Ser Ser Leu Leu Asp Gln Leu Gln Met Gly Cys Asp Gly Ala
50 55 60

Ser Gly Gly Ser Leu Asn Met Glu Cys Arg Val Cys Gly Asp Lys Ala
65 70 75 80

Ser Gly Phe His Tyr Gly Val His Ala Cys Glu Gly Cys Lys Gly Phe
85 90 95

Phe Arg Arg Thr Ile Arg Met Lys Leu Glu Tyr Glu Lys Cys Asp Arg
100 105 110

Ile Cys Lys Ile Gln Lys Lys Asn Arg Asn Lys Cys Gln Tyr Cys Arg
115 120 125

Phe Gln Lys Cys Leu Ala Leu Gly Met Ser His Asn Ala Ile Arg Phe
130 135 140

Gly Arg Met Pro Asp Gly Glu Lys Arg Lys Leu Val Ala Gly Leu Thr
145 150 155 160

Ala Ser Glu Gly Cys Gln His Asn Pro Gln Leu Ala Asp Leu Lys Ala
165 170 175

Phe Ser Lys His Ile Tyr Asn Ala Tyr Leu Lys Asn Phe Asn Met Thr
180 185 190

Lys Lys Lys Ala Arg Ser Ile Leu Thr Gly Lys Ser Ser His Asn Ala
195 200 205

Pro Phe Val Ile His Asp Ile Glu Thr Leu Trp Gln Ala Glu Lys Gly
210 215 220

Leu Val Trp Lys Gln Leu Val Asn Val Pro Pro Tyr Asn Glu Ile Ser
225 230 235 240

Val His Val Phe Tyr Arg Cys Gln Ser Thr Thr Val Glu Thr Val Arg
245 250 255

Glu Leu Thr Glu Phe Ala Lys Asn Ile Pro Asn Phe Ser Ser Leu Phe
260 265 270

Leu Asn Asp Gln Val Thr Leu Leu Lys Tyr Gly Val His Glu Ala Ile
275 280 285

Phe Ala Met Leu Ala Ser Ile Val Asn Lys Asp Gly Leu Leu Val Ala
290 295 300

Asn Gly Ser Gly Phe Val Thr His Glu Phe Leu Arg Ser Leu Arg Lys
305 310 315 320

Pro Phe Ser Asp Ile Ile Glu Pro Lys Phe Glu Phe Ala Val Lys Phe
325 330 335

Asn Ala Leu Glu Leu Asp Asp Ser Asp Leu Ala Leu Phe Ile Ala Ala
340 345 350

Ile Ile Leu Cys Gly Asp Arg Pro Gly Leu Met Asn Val Pro Gln Val
355 360 365

Glu Ala Ile Gln Asp Thr Ile Leu Arg Ala Leu Glu Phe His Leu Gln
370 375 380

Val Asn His Pro Asp Ser Gln Tyr Leu Phe Pro Lys Leu Leu Gln Lys
385 390 395 400

Met Ala Asp Leu Arg His Val Val Thr Glu His Ala Gln Met Met Gln
405 410 415

Trp Leu Lys Lys Thr Glu Ser Glu Thr Leu Leu His Pro Leu Leu Gln
420 425 430

Glu Ile Tyr Lys Asp Met Tyr
435

(2) INFORMATION FOR SEQ ID NO: 11:

(i) SEQUENCE CHARACTERISTICS:

(A) LENGTH: 2468 base pairs

(B) TYPE: nucleic acid

(C) STRANDEDNESS: single

(D) TOPOLOGY: linear

(ii) MOLECULE TYPE: cDNA

(vii) IMMEDIATE SOURCE:

(B) CLONE: XR5 (XR5.SEC)

(ix) FEATURE:

(A) NAME/KEY: CDS
(B) LOCATION: 1..1677

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 11: 

CAA TTC CGG CCC GGA GGG GCG CGG CGC GAG CGG CCG GAG CCC GGC GCC 48
Glu Phe Arg Arg Gly Gly Ala Arg Arg Glu Gly Pro Glu Pro Gly Gly

TCA GGG GCC CAG AGA GTG CCG CGG CCG AGA GCC TGC CGG CCC CTG ACA 96
Ser Gly Ala Gln Arg Val Arg Arg Pro Arg Ala Cys Arg Pro Leu Thr
20 25 30

CCC CCC TCC CCC CCT CCA AGA CCA GGA CGA CCA CTA CCA ACC CCC AAG 144
Ala Pro Ser Pro Arg Gly Arg Pro Gly Arg Arg Leu Arg Arg Arg Lys
35 40 45

TCA TGG CGC ACC ACC CAA CGC CGA GAG CGC CCT GAG CAC CGC CCC ATG 192
Ser Trp Arg Ser Ser Glu Arg Arg Glu Gly Pro Glu His Arg Arg Met

GAG CGG GAC CAA CGG CCA CCT AGC GGA GGG GGA GGC GCC CGG CGC TCG 240
Glu Arg Asp Glu Arg Pro Pro Ser Gly Gly Gly Gly Gly Gly Gly Ser
65 70 75 80

GCG CGG TTC CTG GAG CCG CCC GCC CCG CTC CCT CCG CCG CCG CCC AAC 288
Ala Gly Phe Leu Glu Pro Pro Ala Ala Leu Pro Pro Pro Pro Arg Asn
85 90 95

CCT TTC TGT CAG GAT GAA TTC CCA GAG CTT GAT CCA GGC ACT AAT GGA 336
Gly Phe Cys Gln Asp Glu Leu Ala Glu Leu Asp Pro Gly Thr Asn Gly
100 105 110

GAG ACT CAC ACT TTA ACA CTT CGC CAA CGC CAT ATA CCT CTT TCC CTC 384
Glu Thr Asp Ser Leu Thr Leu Gly Gln Gly His Ile Pro Val Ser Val
115 120 125

CCA GAT CAT CGA CCT GAA CAA CGA ACC TGT CTC ATC TGT CGG CAC CGC 432
Pro Asp Asp Arg Ala Glu Gln Arg Thr Cys Leu Ile Cys Gly Asp Arg
130 135 140

CCT ACC GCC TTC CAC TAT GGG ATC ATC TCC TCC CAC CCC TGC AAG GGG 480
Ala Thr Gly Leu His Tyr Gly Ile Ile Ser Cys Glu Gly Cys Lys Gly
145 150 155 160

TTT TTC AAG ACG ACC ATT TGC AAC AAA CCG CTC TAT CCG TGC ACT CCT 528
Phe Phe Lys Arg Ser Ile Cys Asn Lys Arg Val Tyr Arg Cys Ser Arg
165 170 175

CAC AAC AAC TCT CTC ATC TCC CCC AAG CAC AGG AAC ACA TCT CAC TAC 576
Asp Lys Asn Cys Val Met Ser Arg Lys Gln Arg Asn Arg Cys Gln Tyr

TCC CCC CTC CTC AAC TCT CTC CAC ATC CCC ATC AAC AGG AAG CCT ATC 624
Cys Arg Leu Leu Lys Cys Leu Gln Met Gly Met Asn Arg Lys Ala Ile
195 200 205

ACA GAA GAT CCC ATC CCT GGA GCC CCC AAC AAG ACC ATT CCA CCA CTC 672
Arg Glu Asp Gly Met Pro Gly Gly Arg Asn Lys Ser Ile Gly Pro Val
210 215 220

CAC ATA TCA CAA GAA CAA ATT CAA ACA ATC ATC TCT CCA CAG GAC TTT 720
Gln Ile Ser Glu Glu Glu Ile Glu Arg Ile Met Ser Gly Gln Glu Phe
225 230 235 240

GAC CAA CAA GCC AAT CAC TGG ACC AAC CAT CCT CAC ACC CAC CAC ACT 768
Glu Glu Glu Ala Asn His Trp Ser Asn His Gly Asp Ser Asp His Ser
245 250 255

TCC CCT GCC AAC ACC CCT TCA CAG ACC AAC CAC CCC TCA CCA CCC TCC 816
Ser Pro Gly Asn Arg Ala Ser Glu Ser Asn Gln Pro Ser Pro Gly Ser
260 265 270

ACA CTA TCA TCC ACT AGC TCT CTC CAA CTA AAT GGA TTC ATC CCA TTC 864
Thr Leu Ser Ser Ser Arg Ser Val Glu Leu Asn Gly Phe Met Ala Phe
275 280 285

AGG GAT CAG TAC ATC GGG ATC TCA CTC CCT CCA CAT TAT CAA TAC ATA 912
Arg Asp Gln Tyr Met Gly Met Ser Val Pro Pro His Tyr Gln Tyr Ile
290 295 300

CCA CAC CTT TTT ACC TAT TCT GCC CAC TCA CCA CTT TTC CCC CCA CAA 960
Pro His Leu Phe Ser Tyr Ser Gly His Ser Pro Leu Leu Pro Pro Gln
305 310 315 320

CCT CCA ACC CTC CAC CCT CAC TCC TAC ACT CTC ATT CAT CAC CTC ATC 1008
Ala Arg Ser Leu Asp Pro Gln Ser Tyr Ser Leu Ile His Gln Leu Met
325 330 335

TCA CCC CAA CAC CTC GAC CCA TTG GCC ACA CCT ATC TTG ATT GAA CAT 1056
Ser Ala Glu Asp Leu Glu Pro Leu Gly Thr Pro Met Leu Ile Glu Asp
340 345 350

CCC TAT CCT CTC ACA CAG CCA GAA CTC TTT CCT CTC CTT TCC CCC CTC 1104
Gly Tyr Ala Val Thr Gln Ala Glu Leu Phe Ala Leu Leu Cys Arg Leu
355 360 365

CCC CAC GAC TTC CTC TTT ACC CAC ATT CCC TCC ATC AAC AAC CTC CCT 1152
Ala Asp Glu Leu Leu Phe Arg Gln Ile Ala Trp Ile Lys Lys Leu Pro
370 375 380

TTC TTC TCC CAC CTC TCA ATC AAG CAT TAC ACC TCC CTC TTC ACC TCT 1200
Phe Phe Cys Glu Leu Ser Ile Lys Asp Tyr Thr Cys Leu Leu Ser Ser
385 390 395 400

ACC TCC CAC CAC TTA ATC CTC CTC TCC TCC CTC ACA CTC TAC ACC AAG 1248
Thr Trp Gln Glu Leu Ile Leu Leu Ser Ser Leu Thr Val Tyr Ser Lys
405 410 415

CAC ATC TTT CCC CAC CTC CCT CAT CTC ACA CCC AAC TAC TCA CCC TCT 1296
Gln Ile Phe Gly Glu Leu Ala Asp Val Thr Ala Lys Tyr Ser Pro Ser
420 425 430

CAT CAA CAA CTC CAC ACA TTT ACT CAT CAA CCC ATC CAC CTC ATT CAA 1344
Asp Glu Glu Leu His Arg Phe Ser Asp Glu Gly Met Glu Val Ile Glu
435 440 445

CCA CTC ATC TAC CTA TAT CAC AAC TTC CAT CAC CTC AAC CTC AGC AAC 1392
Arg Leu Ile Tyr Leu Tyr His Lys Phe His Gln Leu Lys Val Ser Asn
450 455 460

CAC CAC TAC CCA TCC ATG AAA CCA ATT AAC TTC CTC AAT CAA CAT ATC 1440
Glu Glu Tyr Ala Cys Met Lys Ala Ile Asn Phe Leu Asn Gln Asp Ile
465 470 475 480

AGG CGT CTG ACC ACT CCC TCA CAC CTC CAA CAA CTC AAC AAC CCC TAT 1488
Arg Gly Leu Thr Ser Ala Ser Gln Leu Glu Gln Leu Asn Lys Arg Tyr
485 490 495

TCG TAC ATT TGT CAG CAT TTC ACT GAA TAT AAA TAC ACA CAT CAC CCA 1536
Trp Tyr Ile Cys Gln Asp Phe Thr Glu Tyr Lys Tyr Thr His Gln Pro
500 505 510

AAC CCC TTT CCT GAT CTT ATG ATG TCC TTC CCA GAG ATC CCA TAC ATC 1584
Asn Arg Phe Pro Asp Leu Met Met Cys Leu Pro Glu Ile Arg Tyr Ile
515 520 525

CCA GGC AAG ATG CTC AAT CTG CCC CTG CAG CAG CTG CCC CTC CTC TTT 1632
Ala Gly Lys Met Val Asn Val Pro Leu Glu Gln Leu Pro Leu Leu Phe
530 535 540

AAC GTG CTG CTG CAC TCC TCC AAG ACA ACT ACC GTG AAG CAG TGACCTGTGC 1684
Lys Val Val Leu His Ser Cys Lys Thr Ser Thr Val Lys Glu
545 550 555

CCTGCACCTC CTTCCGCCAC CCACAGTGCC TTCGGTAGGC AGCACACGCT CCAGACGAAA 1744
GACCCAGAGA CCAACATCCA CACTCTGGAC CAGCTACCTC CATCACAAGA AGAATTTCTT 1804
TGTTTGTCTC TTTTTAACCT CATTTTTCTA TATATTTATT TCACCACAGA GTTCAATCTA 1864
TGGCCTTCAA CATCATCCAC ATCCTTTTGT CTCAATGCAG CACATCCATT TCCTTCCACT 1924
TTACAGAATG TCAACATCTT TAATCTTACC CTCTTGTCAT TCTTTACAGA TACCTTTTTT 1984
TGTATTTTCA TGGACAGGCT ACCATGGACT ACATCAGTAT TTCCATAATC TTCACAAACA 2044
CAACTACCTC AATCCAAACA CCTCTATCAC CATCCCTACC TTTTTCCACA TTTTCTCAGC 2104
AGATACACAC TTGTCTGTTA CACACCAAAC TCCCTTTTTT ATACCCACAG ACTTCTAACT 2164
AAAAGAACCA AACAAAGGAG CCAACTGGTA TACCCAGATT TACTAATCCC CAGTTCGGAC 2224
ATCTGAGAGG CAATTTCATT TTGATCATCT CATCCCACAA CCCTCAACGC AGAAACTCTC 2284
CCTTACCTTC TCCTGCACCC CTCCCCCCCC CCACACCCTG TTGTCTCTTG ATCCTCCTGT 2344
CAACTTTTCA TCCACCTACA GTCCTAACAA TAAGCCAGTA TGTACCACTT CCCTCCCACC 2404
CCCCTTGTAG CTCATAGCTG CCTAGTTTCC TGTTCTAGAT CTACCAAGCC CTACTTCCGA 2464
ATTC 2468

(2) INFORMATION FOR SEQ ID NO: 12:

(i) SEQUENCE CHARACTERISTICS:

(A) LENGTH: 558 amino acids

(B) TYPE: amino acid
(D) TOPOLOGY: linear

(ii) MOLECULE TYPE: protein

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 12:

Glu Phe Arg Arg Gly Gly Ala Arg Arg Glu Gly Pro Glu Pro Gly Gly
1 5 10 15

Ser Gly Ala Gln Arg Val Arg Arg Pro Arg Ala Cys Arg Pro Leu Thr
20 25 30

Ala Pro Ser Pro Arg Gly Arg Pro Gly Arg Arg Leu Arg Arg Arg Lys
35 40 45

Ser Trp Arg Ser Ser Glu Arg Arg Glu Gly Pro Glu His Arg Arg Met
50 55 60

Glu Arg Asp Glu Arg Pro Pro Ser Gly Gly Gly Gly Gly Gly Gly Ser
65 70 75 80

Ala Gly Phe Leu Glu Pro Pro Ala Ala Leu Pro Pro Pro Pro Arg Asn
85 90 95

Gly Phe Cys Gln Asp Glu Leu Ala Glu Leu Asp Pro Gly Thr Asn Gly
100 105 110

Glu Thr Asp Ser Leu Thr Leu Gly Gln Gly His Ile Pro Val Ser Val
115 120 125

Pro Asp Asp Arg Ala Glu Gln Arg Thr Cys Leu Ile Cys Gly Asp Arg
130 135 140

Ala Thr Gly Leu His Tyr Gly Ile Ile Ser Cys Glu Gly Cys Lys Gly
145 150 155 160

Phe Phe Lys Arg Ser Ile Cys Asn Lys Arg Val Tyr Arg Cys Ser Arg
165 170 175

Asp Lys Asn Cys Val Met Ser Arg Lys Gln Arg Asn Arg Cys Gln Tyr
180 185 190

Cys Arg Leu Leu Lys Cys Leu Gln Met Gly Met Asn Arg Lys Ala Ile
195 200 205

Arg Glu Asp Gly Met Pro Gly Gly Arg Asn Lys Ser Ile Gly Pro Val
210 215 220

Gln Ile Ser Glu Glu Glu Ile Glu Arg Ile Met Ser Gly Gln Glu Phe
225 230 235 240

Glu Glu Glu Ala Asn His Trp Ser Asn His Gly Asp Ser Asp His Ser
245 250 255

Ser Pro Gly Asn Arg Ala Ser Glu Ser Asn Gln Pro Ser Pro Gly Ser
260 265 270

Thr Leu Ser Ser Ser Arg Ser Val Glu Leu Asn Gly Phe Met Ala Phe
275 280 285

Arg Asp Gln Tyr Met Gly Met Ser Val Pro Pro His Tyr Gln Tyr Ile
290 295 300

Pro His Leu Phe Ser Tyr Ser Gly His Ser Pro Leu Leu Pro Pro Gln
305 310 315 320

Ala Arg Ser Leu Asp Pro Gln Ser Tyr Ser Leu Ile His Gln Leu Met
325 330 335

Ser Ala Glu Asp Leu Glu Pro Leu Gly Thr Pro Met Leu Ile Glu Asp
340 345 350

Gly Tyr Ala Val Thr Gln Ala Glu Leu Phe Ala Leu Leu Cys Arg Leu
355 360 365

Ala Asp Glu Leu Leu Phe Arg Gln Ile Ala Trp Ile Lys Lys Leu Pro
370 375 380

Phe Phe Cys Glu Leu Ser Ile Lys Asp Tyr Thr Cys Leu Leu Ser Ser
385 390 395 400

Thr Trp Gln Glu Leu Ile Leu Leu Ser Ser Leu Thr Val Tyr Ser Lys
405 410 415

Gln Ile Phe Gly Glu Leu Ala Asp Val Thr Ala Lys Tyr Ser Pro Ser
420 425 430

Asp Glu Glu Leu His Arg Phe Ser Asp Glu Gly Met Glu Val Ile Glu
435 440 445

Arg Leu Ile Tyr Leu Tyr His Lys Phe His Gln Leu Lys Val Ser Asn
450 455 460

Glu Glu Tyr Ala Cys Met Lys Ala Ile Asn Phe Leu Asn Gln Asp Ile
465 470 475 480

Arg Gly Leu Thr Ser Ala Ser Gln Leu Glu Gln Leu Asn Lys Arg Tyr
485 490 495

Trp Tyr Ile Cys Gln Asp Phe Thr Glu Tyr Lys Tyr Thr His Gln Pro
500 505 510

Asn Arg Phe Pro Asp Leu Met Met Cys Leu Pro Glu Ile Arg Tyr Ile
515 520 525

Ala Gly Lys Met Val Asn Val Pro Leu Glu Gln Leu Pro Leu Leu Phe
530 535 540

Lys Val Val Leu His Ser Cys Lys Thr Ser Thr Val Lys Glu
545 550 555
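
The protein sequences in this listing use three-letter residue codes. For comparison with other receptor sequences it can be convenient to convert them to one-letter codes. The following is a minimal sketch of such a conversion, applied to the final row of SEQ ID NO: 12; the function name is hypothetical and the table covers only the twenty standard residues.

```python
# Three-letter to one-letter codes for the twenty standard amino acids.
THREE_TO_ONE = {
    "Ala": "A", "Arg": "R", "Asn": "N", "Asp": "D", "Cys": "C",
    "Gln": "Q", "Glu": "E", "Gly": "G", "His": "H", "Ile": "I",
    "Leu": "L", "Lys": "K", "Met": "M", "Phe": "F", "Pro": "P",
    "Ser": "S", "Thr": "T", "Trp": "W", "Tyr": "Y", "Val": "V",
}


def to_one_letter(residues: str) -> str:
    """Convert whitespace-separated three-letter codes to a one-letter string.

    Raises KeyError on any token that is not a standard three-letter code,
    which also flags transcription errors in a row.
    """
    return "".join(THREE_TO_ONE[token] for token in residues.split())


# Final row of SEQ ID NO: 12 (residues 545 to 558).
print(to_one_letter("Lys Val Val Leu His Ser Cys Lys Thr Ser Thr Val Lys Glu"))
# KVVLHSCKTSTVKE
```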
(2) INFORMATION FOR SEQ ID NO: 13:

(i) SEQUENCE CHARACTERISTICS:

(A) LENGTH: 2315 base pairs

(B) TYPE: nucleic acid
(C) STRANDEDNESS: single
(D) TOPOLOGY: linear

(ii) MOLECULE TYPE: cDNA

(vii) IMMEDIATE SOURCE:

(B) CLONE: XR79 (XR79.SEQ)

(ix) FEATURE:

(A) NAME/KEY: CDS

(B) LOCATION: 204..2009

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 13:

CCGTTAGAAA AGGTTCAAAA TACGCACAAA CTCCTGAAAA TATCCTAACT CACCCCAAGT 60 

AACATAACTT TAACCAAGTG CCTCCAAAAA TAGATGTTTT TAAAAGCTCA AGAATGGTCA 120 

TAACAGACCT CCAATAAGAA TTTTCAAAGA GCCAATTATT TATACACCCC ACCACTATTT 180 

TTTAGCCCCC TCCTCTGCCC ACA ATC GAC CGC CTT AAG CTT GAC ACC TTC 230 

Met Asp Gly Val Lys Val Glu Thr Phe 
1 5 

ATC AAA ACC GAA GAA AAC CCA CCG ATG CCC TTG ATC CGA GGA CGC ACT 278 
He Lys Ser Glu Glu Asn Arg Ala Met Pro Leu He Gly Gly Gly Ser 
10 15 20 25 

CCC TCA CGC CCC ACT CCT CTG CCA CGA CGC CGC CTG CCA ATG GGA CCC 326 
Ala Ser Gly Gly Thr Pro Leu Pro Cly Cly Gly Val Gly Het Gly Ala 
30 35 40 

GGA CCA TCC CCA ACC TTC ACC CTG CAG CTC TCT TTC GTC TCC CCC CAC 374 
Gly Ala Ser Ala Thr Leu Ser Val Glu Leu Cys Leu Val Cys Gly Asp 
45 50 55 

CCC CCC TCC CGC CCG CAC TAC CGA GCC ATA AGC TCC GAA CCC TCC AAC 422 
Arg Ala Ser Cly Arg His Tyr Gly Ala He Ser Cys Glu Cly Cys Lys 
60 65 '70 

GGA TTC TTC AAG CGC TCC ATC CCG AAG CAC CTG CGC TAC CAG TCT CCC 470 
Cly Phe Phe Lys Arg Ser He Arg Lys Gin Leu Gly Tyr Cln Cys Arg 
75 80 85 



GGG CCT ATG AAC TCC GAG GTC ACC AAG CAC CAC AGG AAT CCG TCC CAG 518 
Gly Ala Met Asn Cys Glu Val Thr Lys His His Arg Asn Arg Cys Cln 
90 95 100 105 

TTC TGT CCA CTA CAC AAG TCC CTG GCC ACC CGC ATC CGA ACT CAT TCT 566 
Phe Cys Arg Leu Gin Lys Cys Leu Ala Ser Gly Met Arg Ser Asp Ser 
110 115 120 



CTC CAG CAC CAG AGG AAA CCG ATT CTC CAC AGG AAC GAG GGG ATC ATC 614 
Val Gin His Glu Arg Lys Pro He Val Asp Arg Lys Glu Gly He He 
125 130 135 

CCT CCT GCC CCT AGC TCA TCC ACT TCT CGC CGC CCT AAT CCC TCC TCC 662 
Ala Ala Ala Cly Ser Ser Ser Thr Ser Cly Cly Cly Asn Cly Ser Ser 
140 145 150 
ACC TAC CTA TCC CCC AAG TCC GGC TAT CAG CAG GGC CCT GCC AAC GGG 710
Thr Tyr Leu Ser Gly Lys Ser Gly Tyr Gln Gln Gly Arg Gly Lys Gly
155 160 165 

CAC ACT GTA AAG GCC GAA TCC GCG CCA CGC CTC CAG TGC ACA GCG CGC . 758 
His Ser Val Lys Ala Glu Ser Ala Pro Are Leu Gin Cys Thr Ala Are 
170 175 180 185 

CAG CAA CGG CCC TTC AAT TTC AAT GCA GAA TAT ATT CCG ATG GGT TTG 806 
Gin Gin Are Ala Phe Asn Leu Asn Ala Glu Tyr He Pro Het Cly Leu 
190 195 200 

AAT TTC GCA GAA CTA ACG CAG ACA TTG ATG TTC GCT ACC CAA CAC CAG 854 
Asn Phe Ala Glu Leu Thr Gin Thr Leu MeC Phe Ala Thr Gin Gin Gin 
205 210 215 

CAG CAA CAA CAG CAA CAG CAT CAA CAG ACT GGT AGC TAT TCC CCA GAT 902 
Gin Gin Gin Gin Gin Gin His Gin Cln Ser Gly Ser Tyr Ser Pro Asp 
220 225 230 

ATT CCG AAG CCA GAT CCC CAG CAT CAC GAG GAC GAC TCA ATG GAC AAC 950 
lie Pro Lys Ala Asp Pro Glu Asp Asp Glu Asp Asp Ser Met Asp Asn 
235 240 245 

AGC ACC ACG CTC TGC TTG CAG TTC CTC CCC AAC ACC CCC ACC AAC AAC 998
Ser Ser Thr Leu Cys Leu Gin Leu Leu Ala Asn Ser Ala Ser Asn Asn 
250 255 260 265 

AAC TCC CAG CAC CTC AAC TTT AAT CCT CGG GAA CTA CCC ACC CCT CTC 1046 
Asn Ser Gin His Leu Asn Phe Asn Ala Gly Glu Val Pro Thr Ala Leu 
270 275 280 

CCT ACC ACC TCG ACA ATG GGC CTT ATT CAG ACT TCC CTC GAC ATG CCG 1094 
Pro Thr Thr Ser Thr Met Gly Leu He Gin Ser Ser Leu Asp Met Arg 
285 290 295 

CTC ATC CAC AAC GCA CTC CAC ATC CTC CAG CCC ATC CAA AAC CAA CTG 1142 
Val He His Lys Gly Leu Cln He Leu Gin Pro He Gin Asn Gin Leu 
300 305 310 

CAG CCA AAT CCT AAT CTC ACT CTC AAG CCC CAC TGC CAT TCA CAG CCC 1190 
Glu Are Asn Gly Asn Leu Ser Val Lys Pro Glu Cys Asp Ser Clu Ala 
315 320 325 

GAG GAC ACT CCC ACC CAG GAT GCC GTA GAC GCC GAG CTG CAG CAC ATC 1238 
Clu Asp Ser Cly Thr Clu Asp Ala Val Asp Ala Clu Leu Glu His Met 
330 335 340 345 

CAA CTA GAC TTT GAG TCC GGT GGC AAC CCA AGC CCT CCA ACC CAT TTT 1286
Glu Leu Asp Phe Glu Cys Gly Gly Asn Arg Ser Gly Gly Ser Asp Phe
350 355 360

CCT ATC AAT GAC GCG CTC TTT GAA CAC GAT CTT CTC ACC CAT CTC CAC 1334 
Ala lie Asn Glu Ala Val Phe Clu Gin Asp Leu Leu Thr Asp Val Gin 
365 370 375 

TCT CCC TTT CAT CTC CAA CCC CCG ACT TTC CTG CAC TCG TAT TTA AAT 1382 
Cys Ala Phe His Val Gin Pro Pro Thr Leu Val His Ser Tyr Leu Asn 
380 385 390 

ATT CAT TAT GTC TCT CAC ACC CGC TCG CCA ATC ATT TTT CTC ACC ATC 1430 
He His Tyr Val Cys Clu Thr Gly Ser Arg He He Phe Leu Thr He 
395 400 405 

CAT ACC CTT CCA AAG CTT CCA CTT TTC CAA CAA TTC CAA CCC CAT ACA 1478 
His Thr Leu Are Lys Val Pro Val Phe Clu Gin Leu Glu Ala His Thr 

410 415 420 425 
CAG GTC AAA CTC CTG AGA GGA CTC TGG CCA GCA TTA ATG CCT ATA CCT 1526
Gln Val Lys Leu Leu Arg Gly Val Trp Pro Ala Leu Met Ala Ile Ala
430 435 440

TTG GCC CAG TGT CAG CCT CAG CTT TCG CTG CCC ACC ATT ATC CCC CAG 1574
Leu Ala Gln Cys Gln Gly Gln Leu Ser Val Pro Thr Ile Ile Gly Gln
445 450 455

TTT ATT CAA AGC ACT CGC CAG CTA CCG GAT ATC GAT AAG ATC CAA CCG 1622 
Phe He Gin Ser Thr Arg Gin Leu Ale Asp He Asp Lys He Glu Pro 
460 465. 470 



TTG AAG ATC TCG AAC ATC GCA AAT CTC ACC ACC ACC CTC CAC CAC TTT 1670 
Leu Lys Ile Ser Lys Met Ala Asn Leu Thr Arg Thr Leu His Asp Phe
475 480 485 

CTC CAG CAG CTC CAG TCA CTG CAT CTT ACT CAT ATG GAG TTT CGC TTG 1718 
Val Gln Glu Leu Gln Ser Leu Asp Val Thr Asp Met Glu Phe Gly Leu
490 495 500 505 

CTG CCT CTC ATC TTG CTC TTC AAT CCA ACC CTC TTC CAG CAT CCC AAC 1766 
Leu Arg Leu Ile Leu Leu Phe Asn Pro Thr Leu Phe Gln His Arg Lys
510 515 520

GAG CGG TCG TTG CCA GCC TAC CTC CGC AGA GTC CAA CTC TAC CCT CTG 1814 
Glu Arg Ser Leu Arg Gly Tyr Val Arg Arg Val Gln Leu Tyr Ala Leu
525 530 535 

TCA ACT TTG AGA AGC CAG CCT GCC ATC GGC CGC GCC CAG GAG CCC TTT 1862 
Ser Ser Leu Arg Arg Gln Gly Gly Ile Gly Gly Gly Glu Glu Arg Phe
540 545 550 

AAT CTT CTG GTC CCT CCC CTT CTT CCC CTC AGC AGC CTC CAC CCA GAC 1910 
Asn Val Leu Val Ala Arg Leu Leu Pro Leu Ser Ser Leu Asp Ala Glu
555 560 565 

GCC ATC CAG GAC CTC TTC TTC GCC AAC TTC CTG GCC CAC ATC CAC ATC 1958 
Ala Met Glu Glu Leu Phe Phe Ala Asn Leu Val Gly Gln Met Gln Met
570 575 580 585 

CAT CCT CTT ATT CCC TTC ATA CTG ATG ACC AGC AAC ACC ACT GGA CTG 2006 
Asp Ala Leu Ile Pro Phe Ile Leu Met Thr Ser Asn Thr Ser Gly Leu
590 595 600 

TAGGCGGAAT TGAGAACAAC ACGGCCCAAG CACATTCCCT AGACTGCCCA AAACCAACAC 2066 

TCAAGATCCA CCAAGTGCCG CCAATACATG TAGCAACTAG CCAAATCCCA TTAATTATAT 2126 

ATTTAATATA TACAATATAT ACTTTACCAT ACAATATTCT AACATAAAAC CATCACTTTA 2186 

TTCTTGTTCA CACATAAAAT GCAATCCATT TCCCAATAAA ACCCAATATC TTTTTAAACA 2246

CAATCTTTCC ATCAGAACTT TCACATCTAT ACATTAGATT ATTACAACAC AAAAAAAAAA 2306 

AAAAAAAAA 2315 
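
The running nucleotide count at the end of each line and the residue numbers beneath the translation can be cross-checked against each other. The following is a minimal sketch of that arithmetic, assuming a single reading frame of three bases per residue. The function names are illustrative, the checkpoint pairs are read from the listing above, and the start position of the coding region is inferred from them rather than stated in the listing.

```python
def infer_cds_start(last_nt: int, last_residue: int) -> int:
    """1-based position of the first coding base, given that the codon for
    `last_residue` ends at nucleotide `last_nt` (three bases per residue)."""
    return last_nt - 3 * last_residue + 1


def codon_span(residue: int, cds_start: int) -> tuple[int, int]:
    """1-based (first, last) nucleotide positions of the codon for `residue`."""
    first = cds_start + 3 * (residue - 1)
    return first, first + 2


# (nucleotide count at line end, last residue translated on that line)
checkpoints = [(1094, 297), (1286, 361), (1526, 441), (2006, 601)]

# If the numbering is internally consistent, every checkpoint implies the
# same start position for the coding region.
starts = {infer_cds_start(nt, res) for nt, res in checkpoints}
cds_start = min(starts)

print(starts)
print(codon_span(601, cds_start))
```

If the set holds a single value, the two numbering schemes agree, and the codon for the final residue (Leu 601) ends immediately before the TAG triplet that opens the untranslated sequence.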
(2) INFORMATION FOR SEQ ID NO: 14:

(i) SEQUENCE CHARACTERISTICS:

(A) LENGTH: 601 amino acids

(B) TYPE: amino acid

(D) TOPOLOGY: linear



(ii) MOLECULE TYPE: protein 

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 14: 

Met Asp Gly Val Lys Val Glu Thr Phe Ile Lys Ser Glu Glu Asn Arg
1 5 10 15

Ala Met Pro Leu Ile Gly Gly Gly Ser Ala Ser Gly Gly Thr Pro Leu
20 25 30



Pro Gly Gl£ Gly Val Gly Met Clj Ala Gly Ala Ser Ala Thr Leu Ser
35 40 45



Val Glu Leu Cys Leu Val Cys Gly Asp Arg Ala Ser Gly Arg His Tyr
50 55 60

Gly Ala Ile Ser Cys Glu Gly Cys Lys Gly Phe Phe Lys Arg Ser Ile
65 70 75 80

Arg Lys Gln Leu Gly Tyr Gln Cys Arg Gly Ala Met Asn Cys Glu Val
85 90 95

Thr Lys His His Arg Asn Arg Cys Gln Phe Cys Arg Leu Gln Lys Cys
100 105 110

Leu Ala Ser Gly Met Arg Ser Asp Ser Val Gln His Glu Arg Lys Pro
115 120 125

Ile Val Asp Arg Lys Glu Gly Ile Ile Ala Ala Ala Gly Ser Ser Ser
130 135 140

Thr Ser Gly Gly Gly Asn Gly Ser Ser Thr Tyr Leu Ser Gly Lys Ser
145 150 155 160

Gly Tyr Gln Gln Gly Arg Gly Lys Gly His Ser Val Lys Ala Glu Ser
165 170 175

Ala Pro Arg Leu Gln Cys Thr Ala Arg Gln Gln Arg Ala Phe Asn Leu
180 185 190

Asn Ala Glu Tyr Ile Pro Met Gly Leu Asn Phe Ala Glu Leu Thr Gln
195 200 205

Thr Leu Met Phe Ala Thr Gln Gln Gln Gln Gln Gln Gln Gln Gln His
210 215 220

Gln Gln Ser Gly Ser Tyr Ser Pro Asp Ile Pro Lys Ala Asp Pro Glu
225 230 235 240

Asp Asp Glu Asp Asp Ser Met Asp Asn Ser Ser Thr Leu Cys Leu Gln
245 250 255

Leu Leu Ala Asn Ser Ala Ser Asn Asn Asn Ser Gln His Leu Asn Phe
260 265 270

Asn Ala Gly Glu Val Pro Thr Ala Leu Pro Thr Thr Ser Thr Met Gly
275 280 285

Leu Ile Gln Ser Ser Leu Asp Met Arg Val Ile His Lys Gly Leu Gln
290 295 300
Ile Leu Gln Pro Ile Gln Asn Gln Leu Glu Arg Asn Gly Asn Leu Ser
305 310 315 320

Val Lys Pro Glu Cys Asp Ser Glu Ala Glu Asp Ser Gly Thr Glu Asp
325 330 335

Ala Val Asp Ala Glu Leu Glu His Met Glu Leu Asp Phe Glu Cys Gly
340 345 350

Gly Asn Arg Ser Gly Gly Ser Asp Phe Ala Ile Asn Glu Ala Val Phe
355 360 365

Glu Gln Asp Leu Leu Thr Asp Val Gln Cys Ala Phe His Val Gln Pro
370 375 380

Pro Thr Leu Val His Ser Tyr Leu Asn Ile His Tyr Val Cys Glu Thr
385 390 395 400

Gly Ser Arg Ile Ile Phe Leu Thr Ile His Thr Leu Arg Lys Val Pro
405 410 415

Val Phe Glu Gln Leu Glu Ala His Thr Gln Val Lys Leu Leu Arg Gly
420 425 430

Val Trp Pro Ala Leu Met Ala Ile Ala Leu Ala Gln Cys Gln Gly Gln
435 440 445

Leu Ser Val Pro Thr Ile Ile Gly Gln Phe Ile Gln Ser Thr Arg Gln
450 455 460

Leu Ala Asp Ile Asp Lys Ile Glu Pro Leu Lys Ile Ser Lys Met Ala
465 470 475 480

Asn Leu Thr Arg Thr Leu His Asp Phe Val Gln Glu Leu Gln Ser Leu
485 490 495

Asp Val Thr Asp Met Glu Phe Gly Leu Leu Arg Leu Ile Leu Leu Phe
500 505 510

Asn Pro Thr Leu Phe Gln His Arg Lys Glu Arg Ser Leu Arg Gly Tyr
515 520 525

Val Arg Arg Val Gln Leu Tyr Ala Leu Ser Ser Leu Arg Arg Gln Gly
530 535 540

Gly Ile Gly Gly Gly Glu Glu Arg Phe Asn Val Leu Val Ala Arg Leu
545 550 555 560

Leu Pro Leu Ser Ser Leu Asp Ala Glu Ala Met Glu Glu Leu Phe Phe
565 570 575

Ala Asn Leu Val Gly Gln Met Gln Met Asp Ala Leu Ile Pro Phe Ile
580 585 590

Leu Met Thr Ser Asn Thr Ser Gly Leu
595 600
That which is claimed is:

1. DNA encoding a polypeptide characterized by
having a DNA binding domain comprising about 66 amino acids
with 9 Cys residues, wherein said DNA binding domain has:

(i) less than about 70% amino acid sequence
identity with the DNA binding domain of
hRAR-alpha;

(ii) less than about 60% amino acid sequence
identity with the DNA binding domain of
hTR-beta;

(iii) less than about 50% amino acid sequence
identity with the DNA binding domain of
hGR; and

(iv) less than about 65% amino acid sequence
identity with the DNA binding domain of
hRXR-alpha.
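
The recitation of "about 66 amino acids with 9 Cys residues" can be checked against a sequence by simple counting. The sketch below is illustrative only. It assumes that residues 52 to 117 of Sequence ID No. 14 are a reasonable window for the DNA binding domain (the window boundaries are an assumption made for this example, not taken from the claims), and it uses those residues as transcribed from the listing with evident scanning errors in the three-letter codes normalized.

```python
# Standard three-letter to one-letter amino acid codes.
THREE_TO_ONE = {
    "Ala": "A", "Arg": "R", "Asn": "N", "Asp": "D", "Cys": "C",
    "Gln": "Q", "Glu": "E", "Gly": "G", "His": "H", "Ile": "I",
    "Leu": "L", "Lys": "K", "Met": "M", "Phe": "F", "Pro": "P",
    "Ser": "S", "Thr": "T", "Trp": "W", "Tyr": "Y", "Val": "V",
}


def to_one_letter(three_letter: str) -> str:
    """Convert whitespace-separated three-letter codes to a one-letter string."""
    return "".join(THREE_TO_ONE[code] for code in three_letter.split())


# Residues 52-117 of Sequence ID No. 14; the window is an assumption.
DBD_WINDOW = """
Cys Leu Val Cys Gly Asp Arg Ala Ser Gly Arg His Tyr
Gly Ala Ile Ser Cys Glu Gly Cys Lys Gly Phe Phe Lys Arg Ser Ile
Arg Lys Gln Leu Gly Tyr Gln Cys Arg Gly Ala Met Asn Cys Glu Val
Thr Lys His His Arg Asn Arg Cys Gln Phe Cys Arg Leu Gln Lys Cys
Leu Ala Ser Gly Met
"""

dbd = to_one_letter(DBD_WINDOW)
print(len(dbd), dbd.count("C"))  # window length and number of Cys residues
```

A different choice of window boundaries would change the length but not the number of cysteines, provided the window still spans the first through the last Cys of the stretch shown.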

2. DNA according to Claim 1 wherein the ligand
binding domain of said polypeptide has:

(i) less than about 35% amino acid sequence
identity with the ligand binding domain
of hRAR-alpha;

(ii) less than about 30% amino acid sequence
identity with the ligand binding domain
of hTR-beta;

(iii) less than about 25% amino acid sequence
identity with the ligand binding domain
of hGR; and

(iv) less than about 30% amino acid sequence
identity with the ligand binding domain
of hRXR-alpha.



WO 93/06215 PCT/US92/07570

3. DNA according to Claim 1 wherein said
polypeptide has an overall amino acid sequence identity of:

(i) less than about 35% relative to hRAR-alpha;

(ii) less than about 35% relative to hTR-beta;

(iii) less than about 25% relative to hGR; and

(iv) less than about 35% relative to hRXR-alpha.

4. DNA according to Claim 1 wherein said
polypeptide is characterized by having a DNA binding domain
comprising [XR1]:

(i) about 68% amino acid sequence identity
with the DNA binding domain of
hRAR-alpha;

(ii) about 59% amino acid sequence identity
with the DNA binding domain of
hTR-beta;

(iii) about 45% amino acid sequence identity
with the DNA binding domain of hGR; and

(iv) about 65% amino acid sequence identity
with the DNA binding domain of
hRXR-alpha.

5. DNA according to Claim 1 wherein said
polypeptide is characterized by having a DNA binding domain
comprising [XR2]:

(i) about 55% amino acid sequence identity
with the DNA binding domain of
hRAR-alpha;

(ii) about 56% amino acid sequence identity
with the DNA binding domain of
hTR-beta;

(iii) about 50% amino acid sequence identity
with the DNA binding domain of hGR; and

(iv) about 52% amino acid sequence identity
with the DNA binding domain of
hRXR-alpha.

6. DNA according to Claim 1 wherein said
polypeptide is characterized by having a DNA binding domain
comprising [XR4]:

(i) about 62% amino acid sequence identity
with the DNA binding domain of
hRAR-alpha;

(ii) about 58% amino acid sequence identity
with the DNA binding domain of
hTR-beta;

(iii) about 48% amino acid sequence identity
with the DNA binding domain of hGR; and

(iv) about 62% amino acid sequence identity
with the DNA binding domain of
hRXR-alpha.

7. DNA according to Claim 1 wherein said
polypeptide is characterized by having a DNA binding domain
comprising [XR5]:

(i) about 59% amino acid sequence identity
with the DNA binding domain of
hRAR-alpha;

(ii) about 52% amino acid sequence identity
with the DNA binding domain of
hTR-beta;

(iii) about 44% amino acid sequence identity
with the DNA binding domain of hGR; and

(iv) about 61% amino acid sequence identity
with the DNA binding domain of
hRXR-alpha.
8. DNA according to Claim 1 wherein said
polypeptide is characterized by having a DNA binding domain
comprising [XR79]:

(i) about 59% amino acid sequence identity
with the DNA binding domain of
hRAR-alpha;

(ii) about 55% amino acid sequence identity
with the DNA binding domain of
hTR-beta;

(iii) about 50% amino acid sequence identity
with the DNA binding domain of hGR; and

(iv) about 65% amino acid sequence identity
with the DNA binding domain of
hRXR-alpha.
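
The percentages recited in the claims above are amino acid sequence identities between pairs of domains. As a minimal sketch of how such a figure can be computed, the function below counts identical positions between two sequences that have already been aligned. It assumes gaps are written as "-" and that positions with a gap in either sequence are excluded from the denominator, which is one common convention and not necessarily the one used to obtain the claimed values. The toy fragments are hypothetical. The limit table copies the figures of Claim 1 and the XR5 values copy Claim 7; the qualifier "about" is not modelled.

```python
def percent_identity(a: str, b: str, gap: str = "-") -> float:
    """Percent identical residues over aligned columns with no gap."""
    if len(a) != len(b):
        raise ValueError("aligned sequences must have equal length")
    pairs = [(x, y) for x, y in zip(a, b) if x != gap and y != gap]
    if not pairs:
        return 0.0
    matches = sum(x == y for x, y in pairs)
    return 100.0 * matches / len(pairs)


def below_limits(identities: dict, limits: dict) -> bool:
    """True if every identity is strictly below its corresponding limit."""
    return all(identities[name] < limit for name, limit in limits.items())


# Hypothetical aligned fragments, for illustration only.
print(percent_identity("CLVCGDRASG", "CRVCGDKASG"))  # 8 of 10 positions match

# Upper limits recited in Claim 1 and the values recited for XR5 in Claim 7.
CLAIM_1_LIMITS = {"hRAR-alpha": 70, "hTR-beta": 60, "hGR": 50, "hRXR-alpha": 65}
XR5 = {"hRAR-alpha": 59, "hTR-beta": 52, "hGR": 44, "hRXR-alpha": 61}
print(below_limits(XR5, CLAIM_1_LIMITS))
```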


9. DNA according to Claim 1 wherein the
nucleotide sequence of said DNA is selected from the
nucleotide sequence set forth in Sequence ID No. 1, the
combination of Sequence ID No. 3 and the continuation
thereof as set forth in Sequence ID No. 1, the combination
of Sequence ID No. 5 and the continuation thereof as set
forth in Sequence ID No. 1, Sequence ID No. 7, Sequence ID
No. 9, Sequence ID No. 11, or Sequence ID No. 13.

10. An expression vector comprising DNA
according to Claim 1, and further comprising:

at the 5'-end of said DNA, a promoter and a
triplet encoding a translational start codon, and

at the 3'-end of said DNA, a triplet encoding a
translational stop codon;

wherein said expression vector is operative in an
animal cell in culture to express the protein encoded by
the continuous sequence of amino acid-encoding triplets.

11. An animal cell in culture transformed with
an expression vector according to Claim 10.
12. A method of making a polypeptide comprising
culturing the cells of Claim 11 under conditions suitable
for the expression of said polypeptide.

13. The polypeptide produced by the method of
Claim 12.

14. A polypeptide characterized by having a DNA
binding domain comprising about 66 amino acids with 9 Cys
residues, wherein said DNA binding domain has:

(i) less than about 70% amino acid sequence
identity with the DNA binding domain of
hRAR-alpha;

(ii) less than about 60% amino acid sequence
identity with the DNA binding domain of
hTR-beta;

(iii) less than about 50% amino acid sequence
identity with the DNA binding domain of
hGR; and

(iv) less than about 65% amino acid sequence
identity with the DNA binding domain of
hRXR-alpha.

15. A DNA or RNA labeled for detection; wherein
said DNA or RNA comprises a nucleic acid segment of at
least 20 bases in length, wherein said segment has
substantially the same sequence as a segment of the same
length selected from the DNA segment represented by bases
21 - 1902, inclusive, of Sequence ID No. 1, bases 1 - 386,
inclusive, of Sequence ID No. 3, bases 10 - 300, inclusive,
of Sequence ID No. 5, bases 21 - 1615, inclusive, of
Sequence ID No. 7, bases 21 - 2000, inclusive, of Sequence
ID No. 9, bases 1 - 2450, inclusive, of Sequence ID No. 11,
bases 21 - 2295, inclusive, of Sequence ID No. 13, or the
complement of any one of said segments.
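
The relationship between a labeled segment, a region of a listed sequence, and the complement of that region can be illustrated with a short sketch. The code below is a simplification made for illustration: it treats "substantially the same sequence" as an exact match, and it reads "complement" as the reverse complement so that both strands are written 5' to 3'. The function names and the toy target sequence are hypothetical and are not taken from the sequence listing.

```python
# Translation table mapping each base to its Watson-Crick partner.
_COMPLEMENT = str.maketrans("ACGTacgt", "TGCAtgca")


def reverse_complement(seq: str) -> str:
    """Return the complementary strand, written 5' to 3'."""
    return seq.translate(_COMPLEMENT)[::-1]


def is_probe_candidate(probe: str, target: str, min_len: int = 20) -> bool:
    """True if the probe is at least `min_len` bases long and exactly matches
    a segment of the target or the reverse complement of such a segment."""
    if len(probe) < min_len:
        return False
    return probe in target or reverse_complement(probe) in target


# Hypothetical 40-base target region, for illustration only.
target = "ATGGACGGCGTTAAGGTTGAGACGTTCATCAAGAGCGAAG"
sense_probe = target[5:27]                         # 22 bases copied from the target
antisense_probe = reverse_complement(sense_probe)  # probe for the other strand

print(is_probe_candidate(sense_probe, target))
print(is_probe_candidate(antisense_probe, target))
print(is_probe_candidate(target[:10], target))     # shorter than 20 bases
```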
16. A method of testing a compound for its
ability to regulate transcription-activating effects of a
receptor polypeptide, said method comprising assaying for
the presence or absence of reporter protein upon contacting
of cells containing a receptor polypeptide and reporter
vector with said compound;

wherein said receptor polypeptide is
characterized by having a DNA binding domain comprising
about 66 amino acids with 9 Cys residues, wherein said DNA
binding domain has:

(i) less than about 70% amino acid sequence
identity with the DNA binding domain of
hRAR-alpha;

(ii) less than about 60% amino acid sequence
identity with the DNA binding domain of
hTR-beta;

(iii) less than about 50% amino acid sequence
identity with the DNA binding domain of
hGR; and

(iv) less than about 65% amino acid sequence
identity with the DNA binding domain of
hRXR-alpha; and

wherein said reporter vector comprises:

(a) a promoter that is operable in said
cell,

(b) a hormone response element, and

(c) a DNA segment encoding a reporter
protein,

wherein said reporter protein-encoding
DNA segment is operatively linked to said
promoter for transcription of said DNA
segment, and

wherein said hormone response element
is operatively linked to said promoter for
activation thereof.
17. A chimeric receptor comprising at least an
amino-terminal domain, a DNA-binding domain, and a
ligand-binding domain,

wherein at least one of the domains thereof
is derived from the polypeptide of Claim 13; and

wherein at least one of the domains thereof
is derived from at least one previously
identified member of the steroid/thyroid
superfamily of receptors.

18. DNA encoding the chimeric receptor of
Claim 17.

19. A method to identify compounds which act as
ligands for receptor polypeptides according to Claim 13
comprising:

assaying for the presence or absence of reporter
protein upon contacting of cells containing a chimeric form
of said receptor polypeptide and reporter vector with said
compound;

wherein said chimeric form of said receptor
polypeptide comprises the ligand binding domain of said
receptor polypeptide and the amino-terminal and DNA-binding
domains of at least one previously identified member of the
steroid/thyroid superfamily of receptors;

wherein said reporter vector comprises:

(a) a promoter that is operable in said
cell,

(b) a hormone response element which is
responsive to the receptor from which
the DNA-binding domain of said chimeric
form of said receptor polypeptide is
derived, and

(c) a DNA segment encoding a reporter
protein,

wherein said reporter protein-encoding
DNA segment is operatively linked to said
promoter for transcription of said DNA
segment, and

wherein said hormone response element
is operatively linked to said promoter for
activation thereof, and thereafter

selecting those compounds which induce or block
the production of reporter in the presence of said chimeric
form of said receptor polypeptide.

20. A method to identify response elements for
receptor polypeptides according to Claim 13 comprising:

assaying for the presence or absence of reporter
protein upon contacting of cells containing a chimeric form
of said receptor polypeptide and reporter vector with a
compound which is a known agonist or antagonist for the
receptor from which the ligand-binding domain of said
chimeric form of said receptor polypeptide is derived;

wherein said chimeric form of said receptor
polypeptide comprises the DNA-binding domain of the
receptor polypeptide and the amino-terminal and
ligand-binding domains of at least one previously
identified member of the steroid/thyroid superfamily of
receptors;

wherein said reporter vector comprises:

(a) a promoter that is operable in said
cell,

(b) a putative hormone response element,
and

(c) a DNA segment encoding a reporter
protein,

wherein said reporter protein-encoding
DNA segment is operatively linked to said
promoter for transcription of said DNA
segment, and

wherein said hormone response element
is operatively linked to said promoter for
activation thereof; and

identifying those response elements for which the
production of reporter is induced or blocked in the
presence of said chimeric form of said receptor
polypeptide.



21. A method of testing a compound for its
ability to selectively regulate transcription-activating
effects of a specific receptor polypeptide, said method
comprising:

assaying for the presence or absence of reporter
protein upon contacting of cells containing said receptor
polypeptide and reporter vector with said compound;

wherein said receptor polypeptide is
characterized by being responsive to the presence of a
known ligand for said receptor to regulate the
transcription of associated gene(s);

wherein said reporter vector comprises:

(a) a promoter that is operable in said
cell,

(b) a hormone response element, and

(c) a DNA segment encoding a reporter
protein,

wherein said reporter protein-encoding
DNA segment is operatively linked to said
promoter for transcription of said DNA
segment, and

wherein said hormone response element
is operatively linked to said promoter for
activation thereof; and

assaying for the presence or absence of reporter
protein upon contacting of cells containing chimeric
receptor polypeptide and reporter vector with said
compound;

wherein said chimeric receptor polypeptide
comprises the ligand binding domain of the
receptor of Claim 13 and the DNA binding domain
of said specific receptor; and thereafter

selecting those compounds which induce or block
the production of reporter in the presence of said specific
receptor, but are substantially unable to induce or block
the production of reporter in the presence of said chimeric
receptor.
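
The final selecting step of Claim 21 amounts to a two-condition filter over the results of the two assays. The following is a minimal sketch under stated assumptions: reporter output is summarized as a fold change relative to an untreated control, and a twofold change in either direction is taken to mean "induce or block". Both the fold-change representation and the threshold are hypothetical choices made for this example, as are the compound names and readings.

```python
from dataclasses import dataclass


@dataclass
class AssayResult:
    compound: str
    specific_fold: float  # reporter signal with the specific receptor / control
    chimeric_fold: float  # reporter signal with the chimeric receptor / control


def responds(fold: float, threshold: float = 2.0) -> bool:
    """True if reporter output is induced (>= threshold) or blocked (<= 1/threshold)."""
    return fold >= threshold or fold <= 1.0 / threshold


def select_selective(results, threshold: float = 2.0):
    """Keep compounds that act through the specific receptor but not the chimera."""
    return [
        r.compound
        for r in results
        if responds(r.specific_fold, threshold)
        and not responds(r.chimeric_fold, threshold)
    ]


# Hypothetical readings, for illustration only.
results = [
    AssayResult("A", 5.0, 1.1),  # induces with specific receptor only
    AssayResult("B", 4.0, 3.5),  # induces with both receptors
    AssayResult("C", 0.3, 1.0),  # blocks with specific receptor only
    AssayResult("D", 1.2, 0.9),  # no effect with either receptor
]
print(select_selective(results))
```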

22. A method according to Claim 21 wherein said
contacting is carried out in the further presence of at
least one agonist for said specific receptor.